Generative AI has become a pivotal force in enterprise contexts, driving demand across industries, including highly regulated sectors like finance and healthcare.
In McKinsey’s 2024 global survey, The state of AI in early 2024: Gen AI adoption spikes and starts to generate value, 65% of organizations reported using Generative AI on a regular basis, nearly double the share from a survey conducted ten months earlier.
However, as more organizations leverage Generative AI, their challenge doesn’t just lie in deploying a model but in adopting AI effectively and responsibly. Success depends on many factors – such as data quality, governance, and security – especially for companies in regulated industries where precision and compliance are non-negotiable.
These complex demands lead to a key dilemma: is it better to opt for pre-built, ready-to-use AI solutions or invest in building a model from scratch? Alternatively, could a middle-ground solution like custom AI be the right compromise?
A custom AI model offers a strategic, budget-friendly blend of adaptability, practicality, and security, making it an effective solution for organizations in search of tailored AI without the resource-intensive demands of full-scale development.
This article explores the pros and cons of buying or building AI and examines why custom AI models might be the right choice for companies in regulated industries.
What is an AI model?
To identify which model is the right one for your company, it is important to understand the basis of AI. If you already know what an AI model is, how it’s built and trained, you can go straight to the AI Adoption: options and approaches section.
An AI model is a computational system trained to perform tasks like prediction, classification, or content generation by identifying patterns in data sets. These models apply learned transformations to input data to perform the desired task or produce the desired output.
AI models operate autonomously, based on the data they were trained on. The quality of the training data directly affects the model’s accuracy, a critical factor for companies in regulated industries, such as banking or insurance, where tasks like fraud detection demand unfailing reliability.
For example, training a model to detect whether an image depicts a branded or generic product involves feeding it labeled images of branded and unbranded products. This data enables the model to recognize and distinguish features.
Developing models for highly regulated industries involves distinct challenges, such as data availability, as confidentiality and privacy make it hard to have large datasets readily available. Moreover, these models must meet the exacting standards of organizations that have no margin for error. These considerations intensify the pressure to deliver AI that is both secure and accurate.
How to build an AI model
The lifecycle of a model consists of three phases:
Phase 1: Data preparation
In this phase, an organization collects and cleans its data and evaluates its quality and biases, preparing it for the training process while ensuring compliance with organizational standards.
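As a rough illustration, the sketch below uses pandas for a minimal data-preparation pass: dropping duplicates and incomplete records, checking class balance, and flagging data from unapproved sources. The file name and column names are placeholders, not a prescribed schema.

```python
import pandas as pd

# Hypothetical labeled dataset: one row per product image,
# with a file path and a "branded" / "unbranded" label.
df = pd.read_csv("product_images.csv")  # placeholder file name

# Basic cleaning: drop exact duplicates and rows missing a label or path.
df = df.drop_duplicates()
df = df.dropna(subset=["image_path", "label"])

# Quality check: how balanced are the classes? A heavy skew here
# would bias the trained model toward the majority class.
print(df["label"].value_counts(normalize=True))

# Simple compliance-style check: flag rows sourced from systems
# that are not approved for model training (placeholder column and values).
if "source_system" in df.columns:
    unapproved = df[~df["source_system"].isin(["dam", "ecommerce_catalog"])]
    print(f"{len(unapproved)} rows come from unapproved sources")
```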
Phase 2: Training
In this phase, the model learns from the data prepared in Phase 1, using either supervised or unsupervised techniques. In supervised learning, the model trains on pairs of inputs and expected outputs and learns from its mistakes. In the image recognition example described above, a branded/unbranded product recognition model starts with essentially random responses, then adjusts its parameters (also known as weights) iteratively to improve accuracy. The algorithm behind this process – typically gradient descent – uses backpropagation to determine how each weight should change. Thanks to labeled data (the ground truth), the model can identify and correct errors in its output. Some models instead learn through unsupervised techniques that don’t rely on labeled data.
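To make the update loop concrete, here is a minimal PyTorch sketch of supervised training for a branded/unbranded classifier. The data and the model are toy stand-ins; the point is the loop in which backpropagation computes gradients of the loss and the optimizer’s update rule adjusts the weights batch by batch.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 64 random "images" (3x32x32) with binary branded/unbranded labels.
images = torch.randn(64, 3, 32, 32)
labels = torch.randint(0, 2, (64,))
loader = DataLoader(TensorDataset(images, labels), batch_size=16)

# Tiny stand-in classifier; a real system would use a CNN or vision transformer.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2))
criterion = nn.CrossEntropyLoss()                      # compares predictions to the ground truth
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(3):
    for batch_images, batch_labels in loader:
        optimizer.zero_grad()
        logits = model(batch_images)                   # forward pass: current predictions
        loss = criterion(logits, batch_labels)         # error against the labeled data
        loss.backward()                                # backpropagation: gradients of the loss
        optimizer.step()                               # update rule: adjust the weights
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```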
Phase 3: Deployment
Once trained, the model is deployed, that is, integrated into a business system. Deploying solutions like Large Language Models (LLMs) – 70B parameters or larger – is particularly challenging, as it requires substantial computational resources. By contrast, smaller models, such as computer vision models, may run on embedded or IoT devices, which are far more computationally limited. Deployment must also address latency, particularly in real-time applications: a vision language model (VLM) processing live video, for instance, must analyze and interpret visual content fast enough to generate accurate language-based responses instantly.
Meanwhile, offline inference, which has no real-time constraints, is less challenging performance-wise because it prioritizes throughput over latency. For example, Netflix’s recommendation system doesn’t update in real time; it runs at scheduled intervals, typically overnight.
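A minimal sketch of that offline pattern, with a placeholder model: the whole dataset is scored in large batches on a schedule, so the job is optimized for throughput rather than per-request latency.

```python
import numpy as np

class ThresholdModel:
    """Stand-in for a trained recommendation or scoring model."""
    def predict(self, features: np.ndarray) -> np.ndarray:
        return (features.mean(axis=1) > 0).astype(float)

def nightly_job(model, all_features: np.ndarray, batch_size: int = 4096) -> np.ndarray:
    """Offline inference: score the full dataset in large batches.
    There is no per-request latency constraint, so throughput is what counts."""
    scores = []
    for start in range(0, len(all_features), batch_size):
        scores.append(model.predict(all_features[start:start + batch_size]))
    return np.concatenate(scores)

features = np.random.randn(100_000, 16)           # placeholder feature matrix
scores = nightly_job(ThresholdModel(), features)  # run on a schedule, e.g. overnight
print(scores.shape)                               # results are stored for later serving
```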
How to train an LLM
Training an LLM involves three steps: pre-training, fine-tuning, and alignment.
1. Pre-training
Today, pre-training involves exposing the LLM to vast web-scale datasets (trillions of tokens) to give the model its “basic capabilities”. In other words, even a “basic” LLM is equipped with broad knowledge drawn from the Internet. In the pre-training phase, the model learns the distribution of language and completes familiar patterns with answers that may or may not be relevant. For example, if the input is “tic, tac…”, the model might return “toe” based on the patterns it has learned, whether or not that output is useful.
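The sketch below illustrates this pattern-completion behavior using the Hugging Face transformers library and GPT-2, chosen here only because it is a small, publicly available pre-trained model; the completion it returns may or may not be the “toe” a human would expect.

```python
# Requires the Hugging Face transformers library; GPT-2 is used
# purely as a small example of a pre-trained causal language model.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The pre-trained model continues the pattern with whatever tokens
# are statistically most likely -- relevant or not.
inputs = tokenizer("tic, tac,", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=3, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```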
2. Fine-tuning
Fine-tuning makes the LLM’s behavior more specialized. It requires far less data than pre-training and relies on a different set of algorithms. In supervised fine-tuning (SFT), the model is fed inputs alongside reference text that guides the learning process.
For example, if the input question is “what can you tell me about e-commerce in 2024?”, and the reference text is a 2024 e-commerce trend report, the model’s initial output might be irrelevant. The fine-tuning algorithm will then modify the model’s weights to align its responses with the data from the report. Through this iterative process, the model also learns to engage in natural, conversational dialogue.
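For illustration, here is a minimal sketch of a single SFT step, using GPT-2 as a stand-in for the base model and a one-sentence stand-in for the reference report: the prompt tokens are masked out of the labels so the loss only rewards reproducing the reference text, and one optimizer step nudges the weights in that direction.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")     # stand-in for the base LLM
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

prompt = "What can you tell me about e-commerce in 2024?\n"
reference = "In 2024, e-commerce was shaped by personalization and omnichannel retail."  # stand-in for the trend report

prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
reference_ids = tokenizer(reference, return_tensors="pt").input_ids
input_ids = torch.cat([prompt_ids, reference_ids], dim=1)

# Labels: ignore the prompt tokens (-100) so the loss only rewards
# reproducing the reference text, not the question itself.
labels = input_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100

optimizer.zero_grad()
outputs = model(input_ids=input_ids, labels=labels)   # loss vs. the reference text
outputs.loss.backward()                               # gradients for the weights
optimizer.step()                                      # one fine-tuning update
print(f"SFT loss: {outputs.loss.item():.3f}")
```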
3. Alignment
The final phase, alignment, refines the model by adjusting its responses to fit preferences such as tone of voice or conciseness. Direct Preference Optimization (DPO) is a commonly used technique: the model is presented with an input question and two responses, one accepted and one rejected (a sketch of the underlying preference loss follows the example below).
For example, the input question could be “what can you tell me about e-commerce trends in 2024?” and the answers would be:
- Accepted (concise version): E-commerce in 2024 was driven by personalization algorithms, sustainable practices, and seamless omnichannel experiences.
- Rejected (wordy version): Of course I can tell you what I know about e-commerce trends in 2024. First of all, you must know that e-commerce...
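A minimal sketch of the DPO objective, assuming per-sequence log-probabilities have already been computed for the accepted and rejected answers under both the model being aligned and a frozen reference model:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_chosen: torch.Tensor,
             policy_logp_rejected: torch.Tensor,
             ref_logp_chosen: torch.Tensor,
             ref_logp_rejected: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO loss: push the policy to prefer the accepted answer over the
    rejected one, relative to a frozen reference model."""
    chosen_reward = beta * (policy_logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (policy_logp_rejected - ref_logp_rejected)
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy log-probabilities for one (accepted, rejected) pair.
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.5]))
print(loss)  # lower when the policy favors the accepted answer more than the reference does
```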
AI Adoption: options and approaches
According to McKinsey, there are three archetypes for how companies integrate AI into their operations.
Takers: Off-the-shelf AI solutions
Takers rely on pre-built AI solutions provided by external vendors, with no customization options whatsoever. These solutions work for companies that have neither the budget nor the infrastructure for a custom-built system, such as a small business looking to integrate a chatbot on a pay-per-use model. However, they do not offer the flexibility required by specialized use cases. Most off-the-shelf solutions today are LLM-based.
Pros
- Fast implementation
- No code required
- Minimal costs
Cons
- Data security risks
- Generic
How it works: Minimal infrastructure is required; a plug-and-play solution processes prompts, and any accompanying data, externally.
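In practice, the taker pattern boils down to calling a vendor-hosted API. The sketch below is purely illustrative: the endpoint, payload, and response fields are hypothetical, and the key point is that the prompt and any accompanying company data leave the organization’s infrastructure.

```python
import requests

# Hypothetical vendor endpoint and credentials; real providers differ.
VENDOR_URL = "https://api.example-ai-vendor.com/v1/chat"
API_KEY = "sk-..."  # issued by the vendor

def ask_vendor(prompt: str, context: str) -> str:
    # Note: both the prompt and the company data in `context` are sent
    # to the external provider -- the main data-security trade-off.
    response = requests.post(
        VENDOR_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"prompt": prompt, "context": context},  # hypothetical payload shape
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["answer"]  # hypothetical response field

print(ask_vendor("Summarize this ticket", "Customer reports a failed payment..."))
```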
Shapers: Fine-tuning
Shapers enhance pre-trained models with proprietary data to deliver tailored results. These AI models are usually deployed on private or on-premises infrastructure to ensure data privacy and compliance. Fine-tuning an AI model requires labeled data, making the solution expensive in terms of both setup and data acquisition. These models are particularly useful for companies in regulated industries, such as wealth management firms leveraging AI for fraud detection.
Pros
- Improved data security and control
- Customized outputs relevant to business-specific tasks
- Scalable and adaptable to evolving needs
Cons
- Higher costs for infrastructure and specialized or technical expertise
- Low-code rather than no-code: some technical work is still required
- Complexity in integration with existing IT systems
How it works: The model is hosted within the organization’s infrastructure; data is brought to the model, typically by deploying it on cloud platforms with data aggregation capabilities.
Makers: Building proprietary AI models
Makers develop AI models from scratch, tailoring every aspect to their unique requirements. This requires extensive internal data repositories, advanced AI expertise, and significant computing resources. For example, a healthcare company might build a custom AI diagnostic model from scratch to meet specific, high-stakes demands that no off-the-shelf solution could handle.
Pros
- Total independence from external vendors
- Complete data security
- Model customization options
Cons
- Prohibitive costs
- Substantial time and resource investment (extensive coding required)
- Feasible only for organizations with deep technical expertise
All three approaches can rely on Automated Machine Learning (AutoML), which has become the norm thanks to tooling that can configure models largely on its own. Until recently, technical specialists had to set the system up by hand. In the roughly 10% of cases where AutoML isn’t viable, technical staff can still use the same ML libraries and their basic components to create the specific setup required.
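As a rough illustration of what such tooling automates, the sketch below uses scikit-learn as a stand-in for a dedicated AutoML library: several candidate models are evaluated by cross-validation and the best one is selected without manual tuning.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in dataset for a classification task.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# What AutoML tooling does behind the scenes, in miniature:
# evaluate several candidate models and keep the best-scoring one.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(),
    "random_forest": RandomForestClassifier(n_estimators=100),
}
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}
best_name = max(scores, key=scores.get)
print(scores)
print(f"selected model: {best_name}")
```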
Custom AI: The best of both worlds
For many companies, especially in regulated industries, custom AI balances the strengths of pre-built and proprietary models. Top-performing companies across industries have reported attributing as much as 20% of their EBIT to custom AI.
This strategy involves purchasing a customizable AI solution and enhancing it with proprietary data to meet a company’s specific needs. Instead of going through an expensive fine-tuning process, techniques like Retrieval-Augmented Generation (RAG) let the model pull relevant data from internal sources without retraining, optimizing performance cost-effectively.
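A minimal sketch of the RAG idea, with a deliberately fake embedding function standing in for a real embedding model hosted inside the company’s infrastructure: internal documents are embedded once, the most similar ones are retrieved for each query, and they are prepended to the prompt instead of retraining the model.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding: deterministic random vectors per text.
    A real system would use an actual sentence-embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

documents = [
    "Internal policy: fraud alerts must be reviewed within 24 hours.",
    "Q3 report: card-not-present fraud rose 12% year over year.",
    "Cafeteria menu for next week.",
]
doc_vectors = np.stack([embed(d) for d in documents])  # indexed once, up front

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    q = embed(query)
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [documents[i] for i in np.argsort(sims)[::-1][:k]]

query = "How quickly do we have to review fraud alerts?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
# `prompt` would then be sent to the LLM -- no retraining or weight updates needed.
print(prompt)
```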
Deploying the system on a client’s infrastructure – or in a provider’s cloud – ensures data privacy and security.
This custom AI approach is particularly beneficial for the following reasons.
Data security
Proprietary data remains protected within an organization’s infrastructure, reducing the risk of exposure to external threats. By deploying a custom AI model on secure, private systems, companies can be confident that sensitive information stays confidential and within the bounds of strict regulatory requirements. Organizations also gain greater control over data access and management, since this approach minimizes the involvement of third parties.
Financial efficiency
While building an AI model from scratch is possible, it demands a substantial investment in cost, expertise, computational resources, and time. A custom AI model mitigates those costs by leveraging a pre-built foundation that can be fine-tuned to a business’s specific needs. This lets companies benefit from advanced AI capabilities without bearing the full expense of ground-up development.
Customization
Custom AI combines the strengths of an off-the-shelf solution with the ability to tailor outputs to specific use cases. By integrating proprietary data and business-specific requirements, organizations obtain accurate, relevant, and context-aware results. In addition, the combination of supplier expertise and company-specific customization enables the AI solution to perform the tasks the company actually needs, such as anomaly detection or personalized recommendations.
Unicorn: A custom AI solution
For organizations in highly regulated industries wondering whether to buy or build their own AI, Unicorn is a custom AI solution that can be deployed safely either on a client’s infrastructure or in iGenius’ cloud. Unicorn provides a complete package: a model, fine-tuning, and customer support, which allows clients to have their own proprietary model but with the expertise of iGenius’ team. By connecting a company’s data sources to a custom model through RAG (no code needed), companies can fine-tune their own proprietary model and get precise, reliable information tailored to their own business needs. With an AI-powered model that speaks the organization’s language, this approach allows companies to generate content and make informed decisions, in a safe and private way.
As companies lean more and more on AI, the decision to build or to buy depends on factors and limitations such as budget, technical resources, infrastructure, as well as security and privacy requirements. Custom AI solutions offer a balanced and practical approach, combining the efficiency of pre-built models with specialized proprietary enhancement.
The key to successfully adopting AI starts by assessing your company’s needs and demands. This will empower you to choose a solution that aligns with your long-term goals while safeguarding data and ensuring regulatory compliance.
Frequently Asked Questions
What are custom LLMs?
Custom LLMs are pre-trained models fine-tuned with proprietary data to align with specific organizational requirements. They are particularly beneficial because they are cost-effective, they keep a company’s data private and secure, and they can be trained to perform specific, company-centric tasks.
What is inference?
Inference is when a model generates output (responses or completed tasks) from user-provided input. Drawing on its training data, the model recognizes patterns even in data it has never seen before (such as a new user question), enabling it to reason and predict in a way that mimics human abilities. For example, after being trained to identify an element correctly, inference is when the model can identify – or infer – that same element in new, unseen data.
What is the difference between NLP and LLMs?
Natural Language Processing (NLP) and LLMs are both essential to advancing language capabilities in AI. NLP refers to the broad field of techniques computers use to understand and generate human language, focusing on its semantics and structure. LLMs, meanwhile, are large-scale models trained on vast amounts of data to predict which word or sentence is most likely to appear next in a sequence; in other words, they generate text based on statistical patterns learned from data. NLP techniques aim to model the meaning and structure of human language, while LLMs use advanced prediction capabilities to generate coherent, context-specific text at scale, even though they do not understand language per se.