What Did I Learn from Building LLM Applications in 2024? — Part 1
An engineer’s journey to building LLM-native applications
Large Language Models (LLMs) are poised to transform how we approach AI, and the shift is already visible in the innovative ways LLMs are being integrated with web applications. Since late 2022, multiple frameworks, SDKs, and tools have been introduced to demonstrate the integration of LLMs with web applications or business tools in the form of simple prototypes. With significant investment flowing into Generative AI-based applications and tools for business use, it is becoming essential to bring these prototypes to the production stage and derive business value from them. If you’ve set out to spend your time and money building an LLM-native tool, how do you make sure the investment will pay off in the long term?
To achieve this, it is crucial to establish a set of best practices for developing LLM applications. My journey developing LLM applications over the past year has been incredibly exciting and full of learning. With nearly a decade of experience designing and building web and cloud-native applications, I’ve realized that traditional product development norms often fall short for LLM-native applications. Instead, a continuous cycle of research, experimentation, and evaluation proves far more effective in creating superior AI-driven products.
To help you navigate the challenges of LLM application development, I will talk about best practices in the following key focus areas: use case selection, team mindset, development approach, responsible AI, and cost management.
Ideation: Choosing the right use case
Does every problem require AI as the solution? The answer is a hard ‘no’. Instead, ask yourself which business scenario could benefit most from leveraging LLMs. Businesses need to ask these questions before setting out to build an app. Sometimes the right use case is right in front of us; other times, talking to your co-workers or researching within your organization can point you in the right direction. Here are a few aspects that might help you decide.
- Does the proposed solution have a market need? Conduct market research for the proposed use case to understand the current landscape. Identify existing solutions, with or without AI integration, their pros and cons, and any gaps that your proposed LLM application could fill. This involves analyzing competitors, industry trends, and customer feedback.
- Does it help the users? If your proposed solution aims to serve users within your organization, a common measure of user expectations is whether the solution can enhance their productivity by saving time. A common example is an IT or HR support chatbot that helps employees with day-to-day queries about their organization. Additionally, conducting a short survey with potential users can help you understand the pain points that could be addressed with AI.
- Does it accelerate business processes? Another category of use cases addresses business process improvement, indirectly impacting users. Examples include sentiment analysis of call center transcripts, generating personalized recommendations, and summarizing legal and financial documents. For these use cases, automation can become the key factor in integrating an LLM into a regular business process.
- Do we have the data available? Most LLM-native applications use RAG (Retrieval-Augmented Generation) to generate contextual, grounded answers from specific knowledge documents. The root of any RAG-based solution is the availability, type, and quality of the data. If you do not have an adequate knowledge base or good-quality data, the results from your solution might not be up to the mark. Accessibility of the data is also important, as confidential or sensitive data might not always be at hand.
- Is the proposed solution feasible? Whether to implement the AI solution depends not only on technical feasibility but also on ethical, legal, and financial aspects. If sensitive data is involved, privacy and regulatory compliance should also be taken into consideration before finalizing the use case.
- Does the solution meet your business requirements? Think about the short-term and long-term business goals that your AI solution can serve. Managing expectations is also crucial here, since being too ambitious with short-term goals might not help with value realization. Reaping the benefits of AI applications is usually a long-term process.
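The RAG principle mentioned above is simple to illustrate: retrieve the most relevant chunks from a knowledge base, then ground the model’s prompt in them. The sketch below is a minimal, self-contained illustration of that flow, not a production pipeline; the bag-of-words “embedding” and the sample documents are stand-ins for a real embedding model and vector store.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real app would call an embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank knowledge-base documents by similarity to the query, keep top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Ground the LLM prompt in the retrieved context.
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical HR knowledge base used only for this example.
docs = [
    "Employees get 25 vacation days per year.",
    "The cafeteria opens at 8 am.",
    "Vacation requests must be approved by your manager.",
]
print(build_prompt("How many vacation days do I get?", docs))
```

The quality of the answer is bounded by what `retrieve` can find, which is exactly why data availability and quality decide whether a RAG use case is viable.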
Setting right expectations
Along with choosing the use case, the product owner should also set the right expectations and define short, attainable milestones for the team. Each milestone should have clear goals and a timeline defined and agreed upon by the team, so that stakeholders can review the outcome periodically. This is also crucial for making informed decisions on how to move forward with the proposed LLM-based solution, the productionizing strategy, onboarding users, and so on.
Experimentation: Adopting the right ‘mindset’
Research and experimentation are at the heart of any exercise that involves AI, and building LLM applications is no different. Unlike traditional web apps, which follow a pre-decided design with little to no variation, AI-based designs rely heavily on experiments and can change depending on early outcomes. The success factor is experimenting against clearly defined expectations in iterations, followed by continuously evaluating each iteration. In LLM-native development, the success criterion is usually the quality of the output, which means the focus is on producing accurate and highly relevant results. This could be a chatbot response, a text summary, a generated image, or even an action (in an agentic approach) decided by the LLM. Generating quality results consistently requires a deep understanding of the underlying language models, constant fine-tuning of the prompts, and rigorous evaluation to ensure that the application meets the desired standards.
What kind of tech skill set do you need in the team?
You might assume that a team with only a handful of data scientists is sufficient to build an LLM application. In reality, engineering skills are equally or more important to actually ‘deliver’ the target product, as LLM applications do not follow the classical ML approach. Both data scientists and software engineers need some mindset shifts to get familiar with the development approach. I have seen both roles make this journey: data scientists getting familiar with cloud infrastructure and application deployment, and engineers familiarizing themselves with the intricacies of model usage and evaluation of LLM outputs. Ultimately, you need AI practitioners on the team who are there not just to ‘code’, but to research, collaborate, and improve the AI’s applicability.
Do I really need to ‘experiment’ since we are going to use pre-trained language models?
Popular LLMs like GPT-4o are already trained on large datasets and are capable of recognizing and generating text, images, etc., so you do not need to ‘train’ these models. A few scenarios might require fine-tuning the model, but even that is achievable without the classical ML approach. However, let’s not confuse the term ‘experiment’ with the ‘model training’ methodology used in predictive ML. As mentioned above, the quality of the application’s output matters, and setting up iterations of experiments can help us reach the target quality. For example, if you’re building a chatbot and want to control how the bot’s output looks to the end user, an iterative, experimental approach to prompt improvement and tuning hyperparameters will help you find the right way to generate the most accurate and consistent output.
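The iterative experimentation described above can be structured as a simple sweep: try each prompt variant at each hyperparameter setting, score the output, and keep the best combination. The sketch below is a hypothetical harness; `generate` is a stub standing in for a real LLM API call, and the keyword-based `score` is a deliberately naive quality metric (real projects would use richer evaluation).

```python
import itertools

def generate(prompt: str, temperature: float) -> str:
    # Stub standing in for a real LLM call; it just echoes the task text
    # so that the experiment loop below is runnable end to end.
    return f"[t={temperature}] " + prompt.split(":")[-1].strip()

def score(output: str, expected_keywords: list[str]) -> float:
    # Naive quality metric: fraction of expected keywords present.
    return sum(k in output for k in expected_keywords) / len(expected_keywords)

# Hypothetical prompt variants and temperature settings to compare.
prompts = [
    "Summarize the ticket in one sentence: printer on floor 3 is jammed",
    "You are IT support. Briefly restate the issue: printer on floor 3 is jammed",
]
temperatures = [0.0, 0.7]

results = []
for prompt, temp in itertools.product(prompts, temperatures):
    out = generate(prompt, temp)
    results.append((score(out, ["printer", "jammed"]), prompt, temp))

best = max(results)  # highest-scoring (prompt, temperature) combination
print(best)
```

Logging every (prompt, setting, score) triple like this, instead of eyeballing one-off outputs, is what turns prompt tweaking into a repeatable experiment.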
Build a prototype early in your journey
Build a prototype (also referred to as an MVP, a minimum viable product) with only the core functionalities as early as possible, ideally within 2–4 weeks. If you’re using a knowledge base for a RAG approach, use a subset of the data to avoid extensive data pre-processing.
- Gaining quick feedback from a subset of target users helps you understand whether the solution meets their expectations.
- Review with stakeholders to show not only the good results but also the limitations and constraints your team discovered while building the prototype. This is crucial to mitigate risks early and to make informed decisions regarding delivery.
- The team can finalize the tech stack, the security and scalability requirements for turning the prototype into a fully functional product, and the delivery timeline.
Determine if your prototype is ready for building into the ‘product’
The availability of many AI-focused samples has made it super easy to create a prototype, and initial testing of such prototypes usually delivers promising results. By the time the prototype is ready, the team should have a better understanding of the success criteria, market research, target user base, platform requirements, etc. At this point, considering the following questions can help decide the direction in which the product should move:
- Do the functionalities developed in the prototype serve the primary needs of the end users or the business process?
- What challenges did the team face during prototype development that might come up on the journey to production? Are there methods to mitigate these risks?
- Does the prototype pose any risk with regard to responsible AI principles? If so, what guardrails can be implemented to avoid these risks? (We’ll discuss more on this point in part 2.)
- If the solution is to be integrated into an existing product, what might be a show-stopper for that?
- If the solution handles sensitive data, have effective measures been taken to ensure data privacy and security?
- Do you need to define any performance requirements for the product? Are the prototype’s results promising in this respect, or can they be improved further?
- What security requirements does your product need?
- Does your product need a UI? (A common LLM-based use case is a chatbot, so UI requirements should be defined as early as possible.)
- Do you have a cost estimate for the LLM usage from your MVP? How does it look considering the estimated scale of usage in production and your budget?
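The cost question in the list above can be answered with a back-of-envelope calculation: most hosted LLMs bill per token, with separate input and output rates. The sketch below uses entirely made-up prices and traffic numbers for illustration; plug in your provider’s actual rates and your MVP’s measured token counts.

```python
def monthly_llm_cost(
    requests_per_day: int,
    input_tokens: int,        # avg prompt + retrieved context per request
    output_tokens: int,       # avg completion length per request
    price_in_per_1k: float,   # illustrative input price, not a real quote
    price_out_per_1k: float,  # illustrative output price, not a real quote
) -> float:
    # Per-request cost: tokens are billed per thousand, input and output
    # at different rates.
    per_request = (input_tokens / 1000) * price_in_per_1k \
                + (output_tokens / 1000) * price_out_per_1k
    return per_request * requests_per_day * 30  # ~30 days per month

# Example: 500 chats/day, 2,000-token prompts, 300-token answers,
# with made-up prices of $0.005 / $0.015 per 1k tokens.
cost = monthly_llm_cost(500, 2000, 300, 0.005, 0.015)
print(f"${cost:,.2f} per month")
```

Note how the prompt side dominates here: retrieved RAG context inflates input tokens, so trimming context is often the cheapest optimization.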
If you can get satisfactory answers to most of these questions after the initial review, coupled with good results from your prototype, then you can move forward with product development.
Stay tuned for part 2 where I will talk about what should be your approach to product development, how you can implement responsible AI early into the product and cost management techniques.
Please follow me if you want to read more such content about new and exciting technology. If you have any feedback, please leave a comment. Thanks 🙂
What Did I Learn from Building LLM Applications in 2024? — Part 1 was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.