For further details of the serialized format of protocol buffer messages, please see their website. The generated files use a Java package of crosby.

File format

A file contains a header followed by a sequence of fileblocks. The design is intended to allow future random access to the contents of the file and skipping past not-understood or unwanted data.
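The skipping behaviour can be sketched as follows. This is a minimal illustration, not a reference implementation, and it relies on framing details that are not spelled out in this section: it assumes each fileblock starts with a 4-byte big-endian length, followed by a serialized BlobHeader whose field 1 holds the type string and field 3 the size of the Blob that follows. The function names are hypothetical.

```python
import io
import struct


def read_varint(buf, pos):
    """Decode one protobuf varint from buf at pos; return (value, new_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def parse_blob_header(buf):
    """Extract (type, datasize) from a serialized BlobHeader message.

    Assumed layout: field 1 = type (string), field 3 = datasize (varint).
    Other varint or length-delimited fields are read and ignored.
    """
    pos = 0
    block_type = None
    datasize = None
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
            if field == 3:
                datasize = value
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            data = buf[pos:pos + length]
            pos += length
            if field == 1:
                block_type = data.decode("utf-8")
        else:
            raise ValueError(f"unexpected wire type {wire_type}")
    if block_type is None or datasize is None:
        raise ValueError("BlobHeader is missing type or datasize")
    return block_type, datasize


def iter_fileblocks(stream, wanted):
    """Yield (type, raw Blob bytes) for wanted types; seek past all others."""
    while True:
        prefix = stream.read(4)
        if not prefix:
            return  # clean end of file
        if len(prefix) != 4:
            raise ValueError("truncated length prefix")
        (header_len,) = struct.unpack(">I", prefix)
        block_type, datasize = parse_blob_header(stream.read(header_len))
        if block_type in wanted:
            yield block_type, stream.read(datasize)
        else:
            stream.seek(datasize, io.SEEK_CUR)  # skip without reading
```

The yielded bytes are the still-serialized (and possibly compressed) Blob; decoding it is left out of the sketch. Because the size is known before the payload is read, unwanted or unrecognized blocks cost only a seek.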

Please note that BlobHeader used to be called BlockHeader. It was renamed in v1. To simplify decoder implementations, bzip2 has been deprecated and LZMA has been relegated to a proposed extension. In order to robustly detect illegal or corrupt files, the maximum size of BlobHeader and Blob messages is limited.
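A decoder can apply the size limits before allocating any buffers. The sketch below is only illustrative: the concrete limits (BlobHeader under 64 KiB, uncompressed Blob under 32 MiB) are assumptions taken from the published format description rather than from this section, and the function name is hypothetical.

```python
# Assumed hard limits, in bytes; adjust if the specification you target differs.
MAX_BLOB_HEADER_SIZE = 64 * 1024
MAX_BLOB_SIZE = 32 * 1024 * 1024


def check_block_sizes(header_size, blob_size):
    """Raise ValueError if either declared size is outside the assumed limits."""
    if not 0 <= header_size < MAX_BLOB_HEADER_SIZE:
        raise ValueError(f"BlobHeader size {header_size} out of range")
    if not 0 <= blob_size < MAX_BLOB_SIZE:
        raise ValueError(f"Blob size {blob_size} out of range")
```

Rejecting an oversized length early means that a corrupt length prefix fails fast instead of triggering a huge read.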

Every file must have an 'OSMHeader' block before the first 'OSMData' block.

See osmformat. These contain the entities.

This design lets other software extend the format with fileblocks of additional types for their own purposes. Parsers should ignore and skip fileblock types that they do not recognize. Whether a parser is able to read a given file is determined by required features. If a file contains a required feature that a parser does NOT understand, it must reject the file with an error, and report which required features it does not support. Currently the following features are defined: "OsmSchema-V0.
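The required-features rule can be sketched as a set difference. The helper and exception names are hypothetical, and the feature strings in the usage lines are placeholders, not real feature names.

```python
class UnsupportedFeatureError(ValueError):
    """Raised when a file requires features the parser does not implement."""


def check_required_features(required, supported):
    """Reject the file if any required feature is unknown to the parser.

    The error message lists every unsupported feature, as the rule demands.
    """
    missing = sorted(set(required) - set(supported))
    if missing:
        raise UnsupportedFeatureError(
            "unsupported required features: " + ", ".join(missing)
        )


# Usage with placeholder feature names:
check_required_features(["FeatureA"], supported={"FeatureA", "FeatureB"})
```

Sorting the missing names keeps the error message stable across runs, which helps when it ends up in logs or bug reports.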

In addition, a file may have optional properties that a parser can exploit. For instance, the file may be pre-sorted and need no sorting before use, or the ways in the file may have precomputed bounding boxes. If a program encounters an optional feature it does not know, it can still safely read the file. If a program expects an optional feature that is not there, it can error out.

Geographic": Entities are in some form of geometric sort.

What are the replication fields for? Osmosis is the software used to produce the daily, hourly, and minutely diffs on the planet.

To append updates to a PBF file, one has to know which replication state the file represents so that the right synchronisation point can be found.


Technically this is the internal database timestamp of the last transaction fully contained in the file; it does not necessarily mean that every object with a timestamp smaller than or equal to this timestamp is contained in the file. This usually matches the timestamp: if you know one, you can find out the other, but knowing both makes things easier for the consumer.

When processing a PBF file, you would usually keep these fields intact i.

This means that you cannot safely store a useful string in that slot. Although it is not required, sorting the string table so that frequently used strings have small indexes may improve performance.

You might also improve the deflate compressibility of the string table if you sort strings that have the same frequency lexicographically. Each PrimitiveBlock is independently decompressible and contains all of the information needed to decompress the entities it holds. It contains a string table and also encodes the granularity for both positions and timestamps. A block may contain any number of entities, as long as the size limits for a block are obeyed.
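The string table ordering suggested above (most frequent strings first, ties broken lexicographically) can be sketched like this. The function name is hypothetical, and the sketch assumes that the unusable slot mentioned earlier is index 0, which it fills with an empty placeholder.

```python
from collections import Counter


def build_string_table(strings):
    """Return (table, index) for the given sequence of strings.

    table[0] is an empty placeholder (assumption: slot 0 must not carry a
    useful string). The remaining entries are ordered by descending
    frequency, then lexicographically among equally frequent strings.
    """
    counts = Counter(strings)
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    table = [""] + ordered
    index = {s: i for i, s in enumerate(table) if i > 0}
    return table, index


table, index = build_string_table(
    ["highway", "name", "highway", "residential", "name", "highway"]
)
# table is ["", "highway", "name", "residential"]
```

Small indexes encode into fewer varint bytes, which is presumably where the benefit of putting frequent strings first comes from.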

Packing as many entities as possible into each block results in smaller files. However, for simplicity, certain programs e.

In addition to granularity, the primitive block also encodes a latitude and longitude offset value. These values, measured in units of nanodegrees, must be added to each coordinate. The explanation of the equation for longitude is analogous.
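As a worked illustration of how offset and granularity combine, the sketch below converts a stored integer into degrees as 1e-9 * (offset + granularity * raw). The function name is hypothetical, and the default values (granularity 100, offset 0) are assumptions based on the published schema defaults rather than on this section.

```python
def decode_coordinate(raw, offset=0, granularity=100):
    """Convert a stored integer coordinate to degrees.

    offset is in nanodegrees; granularity is the number of nanodegrees
    represented by one unit of the stored value.
    """
    nanodegrees = offset + granularity * raw
    return nanodegrees * 1e-9


lat = decode_coordinate(515_000_000)                      # about 51.5 degrees
lon = decode_coordinate(-1_278, offset=0, granularity=100_000)
```

The same function serves for latitude and longitude; only the offset passed in differs.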