Equinor Volve Field Dataset · Parquet Format
High-performance columnar storage · Ready for analysis · Compressed & optimized
This repository provides the Equinor Volve field data in Parquet format, converted from the original dataset. Volve was the first Norwegian oil field whose data was fully disclosed, and it offers real-world data for data engineering and analysis workflows.
The data is structured into three normalized tables: daily metrics, monthly aggregations, and well metadata. This layout is well suited to learning SQL, practicing data engineering, or building analytical pipelines.
Query the Parquet files directly, without loading them into a database. Here's a simple example using DuckDB:
    import duckdb

    # Connect to DuckDB (in-memory)
    con = duckdb.connect()

    # Query daily production data: top 10 wellbores by oil volume in 2008
    result = con.execute("""
        SELECT
            w.wellbore_name,
            SUM(d.oil_volume) AS total_oil,
            SUM(d.gas_volume) AS total_gas,
            SUM(d.water_volume) AS total_water
        FROM 'daily_production.parquet' d
        JOIN 'wells.parquet' w
            ON d.npd_wellbore_code = w.npd_wellbore_code
        WHERE d.date BETWEEN '2008-01-01' AND '2008-12-31'
        GROUP BY w.wellbore_name
        ORDER BY total_oil DESC
        LIMIT 10
    """).fetchall()

    # Each row is a tuple: (wellbore_name, total_oil, total_gas, total_water)
    for row in result:
        print(row)
No database setup required. DuckDB reads Parquet files directly and efficiently, making it perfect for exploratory analysis and prototyping.
The dataset consists of three interconnected tables tracking well production metrics at different time granularities.