Tech Innovation with LLMs Producing More Secure and Reliable Gen AI Results

Guest article from

Serena Wellen | Senior Director of Product Management, LexisNexis

The introduction of Generative Artificial Intelligence (Gen AI) tools designed specifically for the legal profession is stirring animated conversations about their potential to transform the way law is practiced. Perhaps less understood is how the technology that makes these tools possible is becoming better and more reliable.

Gen AI describes AI models, including Large Language Models (LLMs), that are designed to create new content in the form of text, images, audio and more. This is the category of AI from which emerged ChatGPT, the chatbot launched in November 2022 that brought Gen AI into the cultural mainstream.

An early model in OpenAI's GPT series, GPT-2, had 1.5 billion parameters, the internal values a model learns during training. Its successor, GPT-3, had 175 billion parameters. OpenAI has not disclosed the size of GPT-4, although unconfirmed estimates have run as high as 170 trillion parameters. But as staggering as this rapid growth is, LLMs may have peaked in size. Indeed, OpenAI's Sam Altman has indicated that “the age of giant AI models is already over” and that future versions will improve in different ways.

Of course, the early-stage versions of these LLMs produced some results that amazed legal professionals with their possibilities, and other results that alarmed them because of the risks. But over the course of the past year, there has been tremendous innovation in LLM technology that is clearly driving Gen AI in the right direction.

For one thing, the gap between proprietary models (e.g., those from OpenAI, Google, Anthropic and Microsoft) and open source models (e.g., Llama, Falcon and Mistral) is narrowing. This is important because the open source ecosystem is driving a huge amount of innovation, fueled by easier access to the models themselves, wider availability of training data sets, lower costs, and the worldwide sharing of research to guide further development.

Second, prompt engineering has evolved to the point where it is much more akin to traditional software engineering. In the early days of Gen AI, the data science behind creating the back-end prompts to guide the models was untested, and few software engineers had the requisite training or experience. We now have a variety of tools — such as LangChain and PromptFlow — that are very similar to other tools and templates regularly used in software engineering, making it easier for developers to create Gen AI applications at scale.
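To illustrate what treating a back-end prompt like ordinary software can look like, here is a minimal sketch that uses only Python's standard library rather than LangChain or PromptFlow. The template wording, the build_prompt function and its field names are hypothetical and chosen purely for illustration.

```python
from string import Template

# Hypothetical back-end prompt, kept in one place so it can be
# versioned, reviewed and tested like any other piece of code.
RESEARCH_PROMPT = Template(
    "You are a legal research assistant.\n"
    "Jurisdiction: $jurisdiction\n"
    "Question: $question\n"
    "Answer using only the sources provided and cite each one."
)


def build_prompt(question: str, jurisdiction: str) -> str:
    """Validate the user's input and fill in the template."""
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    return RESEARCH_PROMPT.substitute(
        question=question, jurisdiction=jurisdiction
    )


if __name__ == "__main__":
    print(build_prompt("Is an email signature binding?", "England and Wales"))
```

Because the prompt is an ordinary function with validated inputs, it can be covered by unit tests and changed under version control in the same way as the rest of an application.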

Third, with the proper techniques, LLMs’ ability to reason and to minimise “hallucination” has become quite impressive. One of these techniques is known as Retrieval Augmented Generation (RAG). RAG is an LLM prompt cycle that retrieves information external to the model to improve its response to a specific query, rather than relying only on the data included in its training. ChatGPT as first released, for example, relied solely on its training data: information extracted from the open web, an unknown portion of which may not be grounded in fact. The most advanced applications of the RAG approach, such as how we use RAG within our Lexis+ AI platform, can now deliver accurate answers that are grounded in a closed universe of authoritative content. In our case, that is the most comprehensive collection of case law, statutes and regulations in the legal industry.

“With the right model training, source materials and integration, RAG is poised to mitigate, if not resolve, some of generative AI’s most troubling issues,” reported Forbes.
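The RAG cycle described above can be sketched in a few lines. This is a toy illustration only, not how Lexis+ AI or any production system works: the three-document corpus, the stop-word list and the word-overlap scoring are all assumptions made for the example, and real systems typically use far more sophisticated retrieval. The sketch stops at building the grounded prompt and does not call a model.

```python
import re

# Hypothetical closed corpus: document ID -> text.
CORPUS = {
    "statute-1": "A contract for the sale of land must be in writing and signed.",
    "case-7": "The court held that an email signature can satisfy a signing requirement.",
    "reg-3": "Food labels must list allergens in bold type.",
}

# Small illustrative stop-word list so common words do not count as matches.
STOPWORDS = {"a", "an", "and", "be", "can", "for", "in", "is",
             "must", "of", "that", "the", "to"}


def tokens(text):
    """Lower-case the text and return its set of non-stop-words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def retrieve(query, corpus, k=2):
    """Return up to k document IDs ranked by word overlap with the query."""
    q = tokens(query)
    scored = [(len(q & tokens(text)), doc_id) for doc_id, text in corpus.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in ranked[:k]]


def build_grounded_prompt(query, corpus, k=2):
    """Place the retrieved sources in the prompt so the answer can cite them."""
    ids = retrieve(query, corpus, k)
    sources = "\n".join(f"[{doc_id}] {corpus[doc_id]}" for doc_id in ids)
    return (
        "Answer using only the sources below and cite them by ID.\n"
        f"Sources:\n{sources}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_grounded_prompt(
        "Must a contract for the sale of land be signed in writing?", CORPUS))
```

The key design point is that the model is asked to answer from the retrieved sources rather than from memory, which is what allows a response to be traced back to specific documents.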

Another important dimension of tech innovation with LLMs is that more organisations are now deploying a “multi-model” approach to building their Gen AI solutions. This shift away from placing big bets on a single LLM is enabling developers to leverage different benefits from different models, creating their own solutions in a more flexible way that maximises functionalities and minimises risks.
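One simple way to picture a multi-model design is a router that sends each kind of task to whichever model suits it best. The sketch below is a hypothetical illustration: the three model functions are stand-ins that merely label their input, and the task names are invented for the example. In practice each would wrap a call to a different LLM.

```python
from typing import Callable, Dict

Model = Callable[[str], str]


# Stand-in models: each just tags the prompt so the routing is visible.
def drafting_model(prompt: str) -> str:
    return f"[drafting-model] {prompt}"


def summary_model(prompt: str) -> str:
    return f"[summary-model] {prompt}"


def general_model(prompt: str) -> str:
    return f"[general-model] {prompt}"


# Hypothetical routing table: task type -> preferred model.
ROUTES: Dict[str, Model] = {
    "draft": drafting_model,
    "summarise": summary_model,
}


def route(task: str, prompt: str) -> str:
    """Send the prompt to the model registered for the task,
    falling back to the general model for unknown tasks."""
    model = ROUTES.get(task, general_model)
    return model(prompt)


if __name__ == "__main__":
    print(route("summarise", "Summarise this judgment."))
    print(route("translate", "Translate this clause."))
```

Keeping the routing table separate from the models means a model can be swapped or added without changing the code that calls it.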

An interesting development to watch in the year ahead is the potential evolution of LLMs into something called Large Agentic Models (LAMs), also known as Large Action Models. LAMs are advanced systems that can perform tasks and make decisions by interfacing with human users or with other automated tools. Unlike traditional AI systems that respond to user prompts, LAMs are designed to understand their environment and take actions to achieve their assigned goals without direct human intervention, according to TechTarget.

But perhaps the most important technology innovation with LLMs for legal professionals is that data security and privacy safeguards are being placed front and centre in the newest tools in development. Secure cloud services are more readily available, data sanitisation and anonymisation are standard in training models, encryption is more reliable than ever, access controls are vastly improved, and there are sound data governance protocols around the retention of prompt inputs and response outputs.

At LexisNexis, we have followed a product development plan that embraces Gen AI technology in a deliberate manner, so that we can capture the upside of tools developed specifically for the legal domain while mitigating the potential risks associated with the first generation of open-web Gen AI tools, such as ChatGPT.

Lexis+ AI is our breakthrough Gen AI platform that is transforming legal work by providing a suite of legal research, drafting, and summarisation tools that delivers on the potential of Gen AI technology. Its answers are grounded in LexisNexis' extensive repository of accurate and exclusive legal content, with industry-leading data security and attention to privacy. By saving time on Lexis+ AI-enabled tasks, legal professionals have more time to do the work only they can do. In fact, our customers have reported time savings of up to 11 hours per week using Lexis+ AI.