About Renee and Culminate Strategy
Renee Bourgeois, a data consultant with Culminate Strategy, works with one of the world's leading providers of products and services to the energy industry to assess the performance of around 500 of its oil and gas wells in the Gulf of Mexico, information that is critical to running its business. She is working with 59 million searchable files and billions of data entries.
The data from oil and gas wells is large, messy, inconsistent, and unwieldy. It includes duplicate entries, incorrect data input, orphaned wells without identifying information, no standard naming, inconsistent entries, and thousands of datasets without a common link. Renee’s existing tools for reviewing, validating, and wrangling the data were time- and labor-intensive and often yielded unusable data after weeks of work. Additionally, Renee was unable to use her analytics systems to access and explore the data because loading it resulted in timeouts and crashes.
Renee used DataChat to load datasets directly from her Databricks database. For her large (~6 billion row) datasets, DataChat can deliver a high-level summary within 30 seconds. Using the Sample skill, she then quickly explored a representative, manageable subset of that data without waiting for the entire dataset to be processed.
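How DataChat's Sample skill works internally is not described here. As a general illustration of the idea, the sketch below uses reservoir sampling (Algorithm R), a standard technique for drawing a uniform random subset from a stream too large to hold in memory. The function name `reservoir_sample` and its parameters are hypothetical and are not part of DataChat or Databricks.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k rows chosen uniformly at random from a stream.

    Hypothetical helper for illustration only. It reads the stream once
    and never keeps more than k rows in memory.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for index, row in enumerate(rows):
        if index < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Pick a slot in [0, index]; the new row replaces an old one
            # only if the slot falls inside the reservoir.
            slot = rng.randint(0, index)
            if slot < k:
                reservoir[slot] = row
    return reservoir


# Stand-in for a very large table: one million row identifiers.
sample = reservoir_sample(range(1_000_000), k=1000, seed=42)
```

Because each row is visited only once, a sample like this can be drawn while the data is still streaming in, which is why exploring a subset can start well before a full load would finish.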
For her datasets containing millions of rows, she could load and explore the entire dataset within minutes, again using DataChat to review the mean, median, and range of each column to confirm the data was within expected parameters, while visualizing the distribution of each column with histograms. With initial exploration complete, she used DataChat to remove data entries without meaningful information, then used DataChat's automatic visualization features to examine her most important KPIs, letting DataChat choose the best visualizations for her.
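For readers who want to see what these checks amount to, here is a minimal sketch in plain Python, not DataChat's own implementation. It computes the mean, median, and range of a column, drops entries with no measurements, and flags values outside an expected window. The column names (`well_id`, `daily_oil_bbl`, `pressure_psi`), the sample values, and the 0 to 10,000 psi window are all made up for illustration.

```python
from statistics import mean, median


def summarize_column(rows, column):
    """Mean, median, and range of one column, ignoring missing values."""
    values = [row[column] for row in rows if row.get(column) is not None]
    if not values:
        return None
    low, high = min(values), max(values)
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": low,
        "max": high,
        "range": high - low,
    }


def drop_empty_entries(rows, measurement_columns):
    """Keep only rows that have at least one measurement filled in."""
    return [
        row for row in rows
        if any(row.get(col) is not None for col in measurement_columns)
    ]


def out_of_range(rows, column, low, high):
    """Rows whose value for `column` falls outside [low, high]."""
    return [
        row for row in rows
        if row.get(column) is not None and not (low <= row[column] <= high)
    ]


# Hypothetical well records; the third row carries no usable information.
wells = [
    {"well_id": "W-1", "daily_oil_bbl": 100.0, "pressure_psi": 2000.0},
    {"well_id": "W-2", "daily_oil_bbl": 300.0, "pressure_psi": 2500.0},
    {"well_id": None, "daily_oil_bbl": None, "pressure_psi": None},
    {"well_id": "W-4", "daily_oil_bbl": 200.0, "pressure_psi": -5.0},
]

cleaned = drop_empty_entries(wells, ["daily_oil_bbl", "pressure_psi"])
oil_summary = summarize_column(cleaned, "daily_oil_bbl")
suspect = out_of_range(cleaned, "pressure_psi", low=0.0, high=10_000.0)
```

In this made-up sample, the negative pressure reading on `W-4` is the kind of value a range check surfaces before any analysis begins.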
Using DataChat, Renee was able to quickly access, review, clean, and validate her data, saving her days (and sometimes weeks) of work to establish a usable, accurate dataset she could use for her analysis of oil well performance.
Within a couple of hours, Renee was able to determine that the data wasn’t consistent, accurate, or salvageable. Previously, it could have taken her days, if not weeks, to assess a dataset with millions of rows.
She was able to deliver high-quality analysis faster than she ever could before, devoting the time she saved to thinking critically about the meaning of the data rather than trying to make the data usable.
“I was able to use DataChat to connect to Databricks and quickly perform data discovery that was previously laborious and time consuming. Routine tasks that previously took days for me to perform manually, I was able to do in minutes. Within 30 seconds of connecting, I was able to inspect a Databricks table schema and begin exploring a dataset with 6 billion rows.”
Renee Bourgeois, Data Consultant