📌 THE POINT IS: Generative AI holds immense promise, but financial institutions must understand that no single model or approach fits every use case. By leveraging a mix of technologies—NLG tools for precision, RAG for reliable data retrieval, and LLMs for user interaction—organizations can create robust, trustworthy systems that balance innovation with risk management.
The world of Generative AI (GenAI) is evolving rapidly, and financial institutions are eager to harness its potential while grappling with the inherent risks it introduces. At the forefront of these concerns is a valid yet often misunderstood fear: the probabilistic nature of Large Language Models (LLMs), which can produce varied responses to the same query. This unpredictability may feel antithetical to the precision-driven world of finance (and government, law, and other regulated fields), but the reality is that no single tool or approach can address every need.
The Risk-Averse Reality of Financial Institutions
Financial institutions and government entities live and breathe compliance, consistency, and trust. For them, the idea that an AI model may not produce the exact same response every time is daunting. It raises concerns about reliability, accountability, and the downstream implications of inaccuracies or "hallucinations" (the GenAI term for a model fabricating false or misleading information).
Recently, I witnessed a conversation in which a colleague asked a data scientist why a GenAI model couldn't deliver 100% reproducible results. The data scientist explained that, as a probabilistic model, GenAI will provide the same essential response each time, but not one that is identical word for word. A February 2024 whitepaper by Edward Kim, Isamu Isozaki, and Naomi Sirkin of the Department of Computer Science at Drexel University, and Michael Robson of the Department of Computer Science at Smith College in Massachusetts, goes much deeper:
As stated previously, in order to have an effective consensus model, and ultimately verify the correctness of machine learning tasks in a distributed, decentralized system, honest nodes must come to agreement and must agree on the same value. However, in the realm of machine learning and deep learning, the parallel nature of the computations can introduce a slight non-determinism. (§2.3) (https://arxiv.org/html/2307.01898v2)
They go on to explain how machine learning models can be made reproducible, but for many organizations, the more practical answer will be to diversify the AI toolkit itself.
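To make the distinction concrete, here is a minimal sketch of why sampling-based generation varies while greedy decoding does not. The token names and probabilities are invented for illustration; they stand in for an LLM's next-token distribution, not any real model's output.

```python
import random

# Toy next-token distribution, standing in for an LLM's output layer.
# Tokens and probabilities are illustrative assumptions.
NEXT_TOKEN_PROBS = {"stable": 0.5, "steady": 0.3, "flat": 0.2}

def pick_token(probs, temperature, rng):
    """Temperature > 0: sample probabilistically; temperature == 0: greedy argmax."""
    if temperature == 0:
        return max(probs, key=probs.get)  # deterministic: always the most likely token
    # Re-weight each probability by the temperature, then draw proportionally.
    weights = {t: p ** (1.0 / temperature) for t, p in probs.items()}
    total = sum(weights.values())
    r = rng.random() * total
    cumulative = 0.0
    for token, w in weights.items():
        cumulative += w
        if r < cumulative:
            return token
    return token  # fallback for floating-point edge cases

rng = random.Random()  # unseeded, like a production LLM call
sampled = {pick_token(NEXT_TOKEN_PROBS, temperature=1.0, rng=rng) for _ in range(50)}
greedy = {pick_token(NEXT_TOKEN_PROBS, temperature=0, rng=rng) for _ in range(50)}

print(sampled)  # typically several different tokens across 50 draws
print(greedy)   # a single token every time
```

The same idea scales up: fixing a random seed and zeroing the temperature gets a real LLM closer to reproducibility, but as the Drexel paper notes, parallel floating-point computation can still introduce slight non-determinism.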
The Role of NLG and RAG Models
One of the most promising complements to LLMs is Natural Language Generation (NLG) technology, such as Arria Studio. Unlike LLMs, which rely on probabilities to generate responses, NLG tools follow predetermined templates and rules to deliver consistent, accurate outputs. This predictability is why tools like Arria Studio are invaluable in high-stakes environments such as finance. As the team at Arria highlights,
"NLG ensures not only accuracy but also reliability, making it ideal for sectors where consistency is paramount." - Arria NLG
This makes them ideal for use cases that demand reliability, such as summarizing financial reports, generating compliance documents, or crafting templated communication.
"In scenarios requiring precision and repeatability, NLG tools excel by ensuring accuracy and adherence to format." - Arria NLG
Another emerging solution is Retrieval-Augmented Generation (RAG), which combines the generative power of LLMs with precise data retrieval mechanisms. Instead of relying solely on the model to "remember" facts (a common flaw of LLMs), a RAG pipeline fetches relevant data from trusted sources and supplies it to the model as grounding context. Finance-specific language models such as BloombergGPT point to the same lesson: outputs that are deeply grounded in accurate financial data can be fluent and trustworthy at once, which is why retrieval-based grounding is emerging as a backbone in industries where data integrity is paramount.
This layered approach allows institutions to maintain control over the source and accuracy of data while benefiting from the fluid, conversational interface LLMs provide. A September 2024 sponsored piece in Harvard Business Review describes this combination as a crucial element for companies that need the highest fidelity in responses:
"Retrieval-Augmented Generation (RAG) is emerging as a preferred customization technique for businesses to rapidly build accurate, trusted generative AI applications. RAG is a fast, easy-to-use approach that helps reduce inaccuracies (or 'hallucinations') and increases the relevance of answers." (https://hbr.org/sponsored/2024/09/the-popular-way-to-build-trusted-generative-ai-rag)
A Future of Collaboration Between Models
Generative AI isn’t an either/or scenario—it’s an ecosystem. Intelligent systems of the future will require multiple models working together, each playing to its strengths. Consider this:
NLG Tools like Arria Studio provide the predictability and consistency required for regulatory compliance and document generation.
RAG Models ensure that data retrieval is reliable, minimizing the risk of hallucinations by grounding responses in vetted, structured sources.
LLMs deliver the human-like interaction that creates intuitive, engaging user experiences.
Blending these technologies with traditional coding practices allows organizations to tailor AI solutions to specific needs. This is the path forward for financial institutions (and others!): a deliberate, well-engineered synergy of tools and methods.
Final Thoughts: Investment with Intent
As the financial world accelerates toward digital transformation, the enthusiasm for GenAI must be matched by an understanding of its limitations and opportunities. NLG models, far from being outdated, remain indispensable for precision and control. RAG models provide a reliable backbone for trustworthy information, while LLMs bridge the gap with conversational capabilities that engage users.
As we step into this exciting new era, organizations must resist the urge to view GenAI as a one-size-fits-all solution. Instead, they should build multi-faceted systems where each model and tool works together seamlessly, balancing innovation with accountability.
Matt Brooks is a seasoned thought leader and practitioner in data and analytics, culture, product development, and transformation.