Practice

Install a Local AI

Run a powerful AI on your own computer. Analyze confidential documents, draft memos from sensitive facts, and summarize case files — without sending a single word to the cloud. This guide gets you from zero to a working local AI in under 30 minutes.

Why Local AI Matters for Lawyers

As a lawyer, you are the custodian of your client's most sensitive information. Attorney-client privilege, ABA Model Rule 1.6 on confidentiality, and your ethical obligations all point to one simple truth: you must control where client data goes.

The Problem with Cloud AI

When you paste a contract, a memo, or case details into a cloud-based AI tool like ChatGPT or Claude, that data leaves your machine and travels to a third-party server. Even with enterprise agreements, you are relying on someone else's infrastructure, someone else's data retention policies, and someone else's security team. For many types of confidential legal work, that is an unacceptable risk.

The Local AI Solution

A local AI runs entirely on your computer. No internet connection is needed. No data is transmitted anywhere. The model loads into your machine's memory, processes your input locally, and generates its response locally. Your documents never leave your desk.

Related reading: For a deeper look at the risks of putting confidential information into cloud AI tools, see What Not to Do #2 — Don't Paste Confidential Information Into AI Tools Without Safeguards.

What is LM Studio?

LM Studio is a free, cross-platform desktop application that lets you download and run open-source large language models (LLMs) directly on your computer. Think of it as having your own private ChatGPT — but one that runs offline and keeps everything local.

Completely Private

Your data never leaves your computer. No telemetry, no cloud sync, no API calls. Everything stays local.

Works Offline

After the initial model download, no internet connection is required. Use it on a plane, in a courthouse, or in a SCIF.

Cross-Platform

Available for Windows, macOS, and Linux. Runs on laptops with 16GB RAM or more.

Thousands of Models

Browse and download from thousands of open-source models on Hugging Face. Find the right one for your task and hardware.

Step-by-Step Installation

Follow these five steps to go from nothing to a working local AI. Most lawyers complete this in 20-30 minutes, depending on internet speed for the model download.

Step 1: Download LM Studio

Go to lmstudio.ai and download the installer for your operating system. LM Studio is available for:

- Windows 10/11
- macOS (Apple Silicon & Intel)
- Linux (Ubuntu/Debian)

The download is approximately 400-500 MB. The application itself is free with no account required.

Step 2: Install and Launch

Run the installer. On Windows, double-click the .exe file. On macOS, drag the app to your Applications folder. On Linux, follow the package instructions on the site.

When you launch LM Studio for the first time, you will see a clean interface with a search bar and a chat window. No configuration is needed yet.

Step 3: Download a Model

Click the Discover tab (magnifying glass icon) in the left sidebar. Search for a model by name. For your first model, we recommend:

Recommended First Model

Llama 3.1 8B Instruct — Search for "llama 3.1 8b instruct" in the Discover tab. This is a strong general-purpose model that runs well on most modern laptops with 16GB of RAM.

The "8B" means 8 billion parameters. Larger models (70B) are more capable but require significantly more hardware — typically 64GB+ RAM or a dedicated GPU. Start small.

Click the download button next to the model. LM Studio will show you the recommended quantization (file size variant). The default is usually fine. The download will be 4-6 GB for an 8B model — this is a one-time download.
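If you are curious where those file sizes come from, a rough back-of-the-envelope rule is: size in GB is roughly the parameter count (in billions) times the bits stored per weight, divided by 8. The figures below are typical values for illustration, not exact sizes for any specific model file:

```python
# Rough rule of thumb: file size in GB ≈ parameters (billions) × bits per weight / 8.
# Quantization stores each weight in fewer bits (e.g. 4 instead of 16),
# which is why an 8-billion-parameter model fits in a ~4 GB download.
def approx_model_size_gb(params_billions: float, bits_per_weight: int) -> float:
    return params_billions * bits_per_weight / 8


print(approx_model_size_gb(8, 4))   # 8B model at 4-bit quantization -> 4.0 (~4 GB)
print(approx_model_size_gb(70, 4))  # 70B model at 4-bit quantization -> 35.0 (~35 GB)
```

This is why the default quantization LM Studio suggests is usually fine: it trades a few bits of precision per weight for a file that actually fits on a laptop.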

Step 4: Load the Model and Start Chatting

Go to the Chat tab (speech bubble icon). At the top of the chat window, click the model selector dropdown and choose the model you just downloaded. LM Studio will load the model into memory — this takes 10-30 seconds depending on your hardware.

Once loaded, you can start typing prompts just like you would with ChatGPT. Try something simple first: "Summarize the key obligations of a party in a standard NDA." You should see the model generate a response in real time.

Step 5: Use the Local Server (Advanced)

LM Studio includes a built-in local server feature. Click the Developer tab (code icon) and start the server. This creates an OpenAI-compatible API endpoint at http://localhost:1234 on your machine.

This lets other applications on your computer (note-taking apps, coding tools, document processors) connect to your local AI using the same interface they would use for cloud APIs — but all traffic stays on your machine. This is optional and not required for basic use.
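To illustrate, here is a minimal Python sketch that sends a prompt to that local endpoint using only the standard library. The URL is LM Studio's default server address; the model name is an example and should match whichever model you have loaded in the app:

```python
import json
import urllib.request

# LM Studio's default local server address (OpenAI-compatible chat endpoint).
LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"


def build_request(prompt: str, model: str = "llama-3.1-8b-instruct") -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,  # low temperature for more consistent analysis
    }


def ask_local_ai(prompt: str) -> str:
    """Send a prompt to the local server; all traffic stays on localhost."""
    payload = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        LMSTUDIO_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(ask_local_ai("Summarize the key obligations of a party in a standard NDA."))
```

Because the endpoint mimics the OpenAI API, most tools that accept a custom "base URL" can be pointed at it with no other changes.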

Recommended Models for Legal Work

Not all models are equal. Here are four models we recommend for lawyers, ranging from lightweight to powerful. Start with the first one and upgrade as your hardware and confidence allow.

Llama 3.1 8B Instruct (Recommended)

The best all-around choice for most lawyers. Strong general reasoning, good at following instructions, and capable of summarization, drafting, and analysis tasks. This is your go-to model for daily work.

16GB RAM minimum | ~4-6 GB download | General purpose

Mistral 7B Instruct

Fast and efficient. Mistral 7B punches above its weight class for summarization and text generation. If your primary need is quickly summarizing documents or generating first drafts, this is an excellent option that responds quickly.

16GB RAM minimum | ~4 GB download | Fast summarization

Phi-3 Mini

Microsoft's compact model. At only 3.8 billion parameters, it runs on older or less powerful hardware. Good for basic tasks: simple summarization, Q&A, and quick drafting. Not as capable as larger models for complex reasoning, but a solid entry point if your machine has limited resources.

8GB RAM minimum | ~2 GB download | Lightweight

Llama 3.1 70B (Power Users)

The heavyweight. 70 billion parameters deliver substantially better reasoning, nuance, and accuracy. Closer to cloud model quality. But it requires serious hardware: 64GB+ RAM or a dedicated GPU with 48GB+ VRAM. Best for lawyers with workstation-class machines or IT-managed infrastructure.

64GB+ RAM or GPU | ~40 GB download | Near cloud quality

Practical Use Cases

Once you have LM Studio running, here are the most valuable ways lawyers are using local AI today. Each of these keeps your data completely on your machine.

Analyzing Contracts

Paste a full contract into the chat and ask the model to identify key obligations, termination provisions, indemnification clauses, or unusual terms. All analysis happens locally.

Summarizing Case Files

Feed in deposition transcripts, medical records, or discovery documents and ask for structured summaries, chronologies, or key fact extraction.

Drafting from Confidential Facts

Provide the model with case-specific facts and ask it to draft a memo, letter, or argument outline. The facts never leave your machine, so privilege is maintained.

Extracting Key Dates and Obligations

Ask the model to extract all deadlines, milestones, payment dates, and deliverables from a contract or settlement agreement into a structured table.

Comparing Document Versions

Paste two versions of a contract or agreement and ask the model to identify all differences, changed provisions, and new language. Useful for redline review when you need the analysis to stay confidential.

Limitations to Keep in Mind

Local AI is a powerful tool, but it is not a replacement for cloud models in every situation. Understand these trade-offs so you can choose the right tool for each task.

1. Smaller Models, Lower Capability

An 8B parameter model running locally is substantially less capable than GPT-4o or Claude 3.5 Sonnet running in the cloud. Expect simpler reasoning, occasional errors, and less nuanced output. Always verify.

2. No Internet Access

Local models cannot search the web, access legal databases, or retrieve current information. They work only with what you provide in the prompt and their training data (which has a knowledge cutoff date).

3. Slower Response Times

Local inference is slower than cloud APIs, especially on consumer hardware. A response that takes 2 seconds from ChatGPT might take 15-30 seconds from a local model. This is acceptable for careful analysis but less ideal for rapid iteration.

4. Output Still Requires Verification

A local model can hallucinate just like a cloud model. It can invent case citations, misstate legal standards, or produce plausible-sounding but incorrect analysis. The verification obligation is the same regardless of where the model runs.

The Smart Approach: Use Both

The most effective lawyers use cloud AI for non-confidential work (legal research, general drafting, learning) and local AI for confidential work (client documents, privileged analysis, sensitive facts). This gives you the best of both worlds: maximum capability when privacy is less critical, and maximum privacy when it matters most.

Keep Building Your AI Skills

Now that you have a local AI running, learn how to get the best results from it. Our prompt engineering guide and quick wins work just as well with local models as they do with cloud tools.

Ready for structured learning? Explore the Learning Program →
