USE CASE / R&D / PRODUCT DEVELOPMENT

Finding patterns in vast amounts of multi-modal data

AI can be used to improve R&D success rates, reduce costs and mitigate the risks of high-risk, high-reward exploration processes.

FUNCTION: R&D; INDUSTRY: MINING

Extract key data & trends from petabytes of data to make better decisions, faster

For research-intensive sectors such as healthcare, pharmaceuticals, energy and resources, R&D is resource-intensive, dependent on human effort, lengthy, risky and often fails to deliver ROI. AI can supplement human oversight.

    • 30+ yrs of multi-modal, disconnected geotech data for an oil and gas developer. Data in text, numbers, handwriting, pictures, graphs, lab notes, seismic surveys, emails, well logs, core samples, production data etc.

    • Numerous data silos.

    • Instances of low-volume, poor-quality seismic data.

    • Low success and high cost of R&D meant risk appetite had reduced, leading to competitive disadvantage.

    • Teams lacked real-time visibility into historical progress, cost and technical risk.

    • Challenge was to use technology to ingest 30+ yrs of data and draw out patterns and trends, leading to better future R&D outcomes.

    • Use our Consulting approach to fully understand the problem, root causes, goals and relevant data.

    • Deeply understand the current-state process and the human review and thought process, including the rules and decision points.

    • Clean, structure and optimise data.

    • Where necessary and agreed, create synthetic data to augment.

    • Confirm our hypothesis that an AI approach can solve the problem.

    • Rapid PoC to stakeholders. Use PoC feedback to iterate. Agile approach minimises risk and ensures the tool evolves to meet genuine user needs.

    Missing, poor-quality data

    AI is used to generate new data samples that resemble the patterns and characteristics of the existing seismic data, and to improve data quality through denoising or resolution enhancement.
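
    As a rough illustration of the denoising step only, the sketch below applies a classical sliding median filter to a single trace. This is a simple stand-in rather than the AI model described above, and the function name `median_denoise`, the window size and the sample values are all hypothetical.

```python
from statistics import median


def median_denoise(trace, window=5):
    """Return a copy of `trace` smoothed with a sliding median filter."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(trace)):
        # Clamp the window at the start and end of the trace.
        lo = max(0, i - half)
        hi = min(len(trace), i + half + 1)
        smoothed.append(median(trace[lo:hi]))
    return smoothed


# Hypothetical trace with two spike artefacts (9.0 and -7.0).
noisy_trace = [0.0, 0.1, 0.2, 9.0, 0.4, 0.5, 0.6, -7.0, 0.8, 0.9]
clean_trace = median_denoise(noisy_trace, window=3)
```

    A median filter removes isolated spikes while keeping the underlying trend; a learned denoiser would replace `median_denoise` but could keep the same trace-in, trace-out interface.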

    Document ingestion & OCR

    Use OCR and layout-aware models (LayoutLMv3 and Donut) to read files.

    Document classification

    Tag documents by type using an open model. NLP for project classification and mapping. LangChain for data retrieval and contextual linking.

    Predictive modeling & simulation

    Train ML models, including models built in TensorFlow, on lab and production data to predict material performance, failure rates or process yields.

    Generative design

    Generate designs that meet specified constraints.

    Time-series forecasting

    Predict experiment durations, resource usage and demand for prototypes using ARIMA and Temporal Fusion Transformer models.
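
    To make the forecasting idea concrete, here is a minimal sketch of the simplest autoregressive case, AR(1) with a constant, fitted by least squares. It is a deliberately reduced relative of ARIMA, not the full method or the Temporal Fusion Transformer, and the function names and the sample durations are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    x = series[:-1]  # previous values
    y = series[1:]   # next values
    n = len(x)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    if var_x == 0:
        raise ValueError("series is constant; phi is not identifiable")
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi


def forecast(series, steps):
    """Roll the fitted AR(1) model forward `steps` periods."""
    c, phi = fit_ar1(series)
    last = series[-1]
    predictions = []
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


# Hypothetical experiment durations (days) for successive iterations.
durations = [10.0, 7.0, 5.5, 4.75, 4.375]
next_two = forecast(durations, steps=2)
```

    In practice a library implementation with differencing, moving-average terms and prediction intervals would be used; the sketch only shows the fit-then-roll-forward shape of the task.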

    Predicting ROI

    Use inference models to understand which factors drive success, and Bayesian optimisation & uncertainty modeling to help allocate resources to the most promising research streams.
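
    One simple way to picture uncertainty-aware allocation is a Beta-Bernoulli model with Thompson-style sampling, sketched below. This is a lighter-weight relative of Bayesian optimisation rather than the approach itself, and the stream names, success counts and the function `allocation_shares` are hypothetical.

```python
import random


def allocation_shares(history, draws=5000, seed=0):
    """Estimate, per stream, the probability that it has the highest success rate.

    `history` maps a stream name to (successes, failures).
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in history}
    for _ in range(draws):
        # Sample a plausible success rate for each stream from its Beta
        # posterior (uniform Beta(1, 1) prior updated with observed counts).
        sampled = {
            name: rng.betavariate(1 + successes, 1 + failures)
            for name, (successes, failures) in history.items()
        }
        wins[max(sampled, key=sampled.get)] += 1
    return {name: count / draws for name, count in wins.items()}


# Hypothetical track record of three research streams.
history = {"stream_a": (8, 2), "stream_b": (3, 7), "stream_c": (1, 1)}
shares = allocation_shares(history)
```

    The shares can be read as a starting point for splitting budget: streams with little history keep a non-zero share because their success rate is still uncertain.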

    Q&A summary

    Let users ask queries using RAG + LLM.
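
    The retrieval half of RAG can be sketched without any framework. The example below ranks documents by bag-of-words cosine similarity and assembles a prompt for an LLM; a production system would more likely use embeddings and a vector store. The document IDs, texts, prompt wording and function names are all hypothetical, and the LLM call itself is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, top_k=2):
    """Return the IDs of the `top_k` documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine(query_vec, Counter(tokenize(text))), doc_id)
        for doc_id, text in documents.items()
    ]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]


def build_prompt(query, documents, top_k=2):
    """Assemble the retrieved context and the question into one prompt."""
    ids = retrieve(query, documents, top_k)
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in ids)
    return f"Answer using only the context below.\n{context}\nQuestion: {query}"


# Hypothetical document store.
documents = {
    "well_log_17": "Well log shows porosity of the sandstone reservoir at 2100 m depth.",
    "email_204": "Meeting moved to Thursday to discuss the drilling budget.",
    "core_report_3": "Core sample analysis reports low permeability in the shale layer.",
}
query = "What porosity did the well log show?"
prompt = build_prompt(query, documents, top_k=1)
```

    Keeping the document IDs in the prompt lets the answer cite its sources, which supports the human review step.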

    Full audit trail

    Automatically log extracted entities, reviewer comments and AI reasoning with LangChain and AnythingLLM.
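
    A minimal sketch of what one audit record could look like, using only the Python standard library rather than LangChain or AnythingLLM. The field names, the JSON-lines layout and the sample values are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone


def audit_record(document_id, entities, reviewer_comment, ai_reasoning, timestamp=None):
    """Serialise one audit entry as a single JSON line."""
    ts = timestamp or datetime.now(timezone.utc)
    return json.dumps(
        {
            "timestamp": ts.isoformat(),
            "document_id": document_id,
            "entities": entities,
            "reviewer_comment": reviewer_comment,
            "ai_reasoning": ai_reasoning,
        },
        sort_keys=True,
    )


def append_audit(log_lines, **fields):
    """Append a new record to an in-memory log and return the stored line."""
    line = audit_record(**fields)
    log_lines.append(line)
    return line


# Hypothetical entry for one reviewed document.
audit_log = []
append_audit(
    audit_log,
    document_id="well_log_17",
    entities={"depth_m": 2100, "lithology": "sandstone"},
    reviewer_comment="Depth confirmed against the original scan.",
    ai_reasoning="Depth value read from the header table of the log.",
)
```

    Writing one JSON object per line keeps the log append-only and easy to load into a reporting tool later.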

    Data insights

    Summarise compliance coverage and trends in Tableau.

    Human reviewers validate model outputs. This is extremely important when validating geoscience outputs.

    Infra

    On-prem GPU compute, so that data sovereignty is engineered in and no data leaves the client environment.

    • 25% improvement in ROI per $ invested.

    • 40% shorter development cycle times.

    • 30% cost saving on trials and testing.

    • Better project risk management.