Load

BigQuery offers a convenient way to query and analyze large amounts of data.

Info

In case you are new to BigQuery, you might want to:

Data schema

To load a table into BigQuery, you need to specify its schema. The bin/create-schema.py script (a Python CLI) generates this schema for you. Set the --prepare-names / --no-prepare-names option to the same value you used when you serialized the data; otherwise the field names in the schema may not match those in the serialized files.

python bin/create-schema.py \
--prepare-names \
path/to/schema.json  # destination file
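
Before loading, it can help to sanity-check the generated file. The sketch below is not part of the project; it assumes the schema file follows the usual BigQuery JSON layout (a list of field objects with "name", "type", an optional "mode", and nested "fields" for RECORD columns). The function name check_schema and the sample field names are hypothetical, and the list of types is not exhaustive.

```python
import json

# Column types commonly accepted in a BigQuery JSON schema file (not exhaustive).
KNOWN_TYPES = {
    "STRING", "BYTES", "INTEGER", "INT64", "FLOAT", "FLOAT64", "NUMERIC",
    "BIGNUMERIC", "BOOLEAN", "BOOL", "TIMESTAMP", "DATE", "TIME", "DATETIME",
    "GEOGRAPHY", "JSON", "RECORD", "STRUCT",
}
KNOWN_MODES = {"NULLABLE", "REQUIRED", "REPEATED"}


def check_schema(fields, prefix=""):
    """Return a list of human-readable problems found in a schema field list."""
    if not isinstance(fields, list):
        return [f"{prefix or '<root>'}: expected a list of fields"]
    problems = []
    for i, field in enumerate(fields):
        if not isinstance(field, dict):
            problems.append(f"{prefix}[{i}]: expected an object")
            continue
        name = field.get("name")
        label = name if name else f"[{i}]"
        path = prefix + label
        if not name:
            problems.append(f"{path}: missing 'name'")
        ftype = str(field.get("type", "")).upper()
        if ftype not in KNOWN_TYPES:
            problems.append(f"{path}: unknown type {field.get('type')!r}")
        # BigQuery treats a missing mode as NULLABLE.
        mode = str(field.get("mode", "NULLABLE")).upper()
        if mode not in KNOWN_MODES:
            problems.append(f"{path}: unknown mode {field.get('mode')!r}")
        # Nested columns carry their own list of fields; check them recursively.
        if ftype in {"RECORD", "STRUCT"}:
            problems.extend(check_schema(field.get("fields", []), prefix=path + "."))
    return problems


# Hypothetical sample; real field names come from the generated schema file.
sample = json.loads("""
[
  {"name": "doc_id", "type": "STRING", "mode": "REQUIRED"},
  {"name": "sections", "type": "RECORD", "mode": "REPEATED",
   "fields": [{"name": "text", "type": "STRING"}]}
]
""")
print(check_schema(sample))
```

To check a real file, load it with json.load(open("path/to/schema.json")) and pass the result to check_schema; an empty list means no problems were found by this (limited) check.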

Load table

Tip

For efficiency, upload the serialized files to a Google Cloud Storage bucket before loading them.

# Copy the serialized files to the bucket
gsutil -m cp -r path/to/folder/ gs://your-bucket/

# Load the table (using gs://your-bucket/EP*.jsonl as the source is recommended)
bq load --source_format=NEWLINE_DELIMITED_JSON \
--ignore_unknown_values \
--replace \
--max_bad_records=10 \
project:dataset.table \
path/to/EP*.jsonl \
path/to/schema.json
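
If you prefer to drive the load from a script, the same command can be assembled in Python. This is a minimal sketch, not something the project ships: the helper name build_bq_load_command and its parameters are hypothetical, and it only reproduces the flags shown above. Passing the arguments as a list (rather than a shell string) keeps the wildcard from being expanded by the local shell, so BigQuery receives the gs:// pattern as written.

```python
import shlex


def build_bq_load_command(table, source, schema, replace=True, max_bad_records=10):
    """Assemble the argument list for the `bq load` call shown above."""
    cmd = [
        "bq", "load",
        "--source_format=NEWLINE_DELIMITED_JSON",
        "--ignore_unknown_values",
    ]
    if replace:
        # Overwrite the destination table instead of appending to it.
        cmd.append("--replace")
    cmd.append(f"--max_bad_records={max_bad_records}")
    # Positional arguments: destination table, source file(s), schema file.
    cmd += [table, source, schema]
    return cmd


cmd = build_bq_load_command(
    "project:dataset.table",
    "gs://your-bucket/EP*.jsonl",
    "path/to/schema.json",
)
# Show the command as it would be typed in a shell (wildcard quoted).
print(shlex.join(cmd))
```

To actually run the load, pass the list to subprocess.run(cmd, check=True); this assumes the bq CLI is installed and authenticated.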