Giving Microsoft Phi-4 LLM model a try


Microsoft has officially released the MIT-licensed Phi-4 model. It is available on Hugging Face: https://huggingface.co/microsoft/phi-4.

Phi-4 is a 14B-parameter, state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets.

I wanted to give it a try, so I ran it with Ollama on runpod.io. You can follow the instructions here: https://docs.runpod.io/tutorials/pods/run-ollama

I used the 4-bit quantized model on Ollama. You can also try the 8-bit and fp16 versions. As I mentioned in my last blog post, 4-bit quantization strikes a good balance between performance and efficiency. I also tried the 8-bit quantized model, but both worked the same for me.

Once Ollama was installed, I ran the 4-bit quantized model:

ollama run phi4

The model size is 9.1 GB, and it will consume 28 GB of RAM. I chose the A40 GPU on Runpod.
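
As a rough sanity check on these sizes, the weight footprint of a model can be estimated as parameter count times bits per weight. The sketch below is a back-of-the-envelope calculation, not something Ollama computes; the helper name is made up for illustration, and the estimate ignores the KV cache, activations, and runtime overhead.

```javascript
// Back-of-the-envelope estimate: weights only, in decimal gigabytes.
// estimateWeightGB is a hypothetical helper, not part of Ollama.
function estimateWeightGB(paramCount, bitsPerWeight) {
  const bytes = (paramCount * bitsPerWeight) / 8;
  return bytes / 1e9;
}

// "14B parameter" is taken from the model description above.
const PHI4_PARAMS = 14e9;

for (const [label, bits] of [["4-bit", 4], ["8-bit", 8], ["fp16", 16]]) {
  console.log(`${label}: ~${estimateWeightGB(PHI4_PARAMS, bits)} GB`);
}
```

The naive 4-bit figure (about 7 GB) is below the 9.1 GB download, presumably because quantized files do not store every tensor at exactly 4 bits and carry extra metadata, so treat the estimate as a lower bound.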

Once Ollama has pulled the model, you can start typing your queries. Ollama offers both a CLI and an HTTP API.

root@3180bdae25b1:/# ollama run phi4
>>> who are you?
I'm Phi, a language model developed by Microsoft. I'm designed to assist with answering questions, providing information, and engaging in conversation across a wide range
of topics. My main goal is to help users by offering clear, accurate, and helpful responses while ensuring your privacy and security are respected. If you have any
questions or need assistance, feel free to ask!

Then I asked about its knowledge cutoff date:

>>> what's your knowledge cutoff date
My knowledge is current up until October 2023. This means that any information or events beyond this date are not included in my training data, and I may not have the
latest updates on developments after that time. If you need up-to-date information on topics post-October 2023, I recommend consulting reliable and current sources. Let
me know if there's anything else you'd like to explore within my knowledge range!
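
The session above goes through the CLI. The same model can also be queried over Ollama's HTTP API. The following is a minimal sketch, assuming Ollama is listening on its default port 11434 on the same machine and that you are running Node 18+ (for the built-in fetch). The helper names are illustrative, and the request asks for a single non-streaming response.

```javascript
// Assumption: Ollama is listening on its default port on the same machine.
const OLLAMA_BASE_URL = "http://localhost:11434";

// Build the fetch() arguments for a single, non-streaming completion.
function buildGenerateRequest(model, prompt, baseUrl = OLLAMA_BASE_URL) {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/api/generate`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, prompt, stream: false }),
    },
  };
}

// Send the prompt and return the generated text.
async function askOllama(model, prompt, baseUrl = OLLAMA_BASE_URL) {
  const { url, options } = buildGenerateRequest(model, prompt, baseUrl);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`Ollama returned HTTP ${res.status}`);
  }
  const data = await res.json();
  // With stream: false, the full answer is in the "response" field.
  return data.response;
}

// Example (requires a running Ollama server with phi4 pulled):
// askOllama("phi4", "who are you?").then(console.log);
```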

I use language models for a variety of tasks, with the most common being coding. I generate single-purpose HTML/JS web applications to assist me with my work. You can find them here: https://tools.o14.ai/. All these applications are entirely generated using ChatGPT (GPT-4o model).

One of the tools I built diffs two texts: https://tools.o14.ai/text-diff.html.

It also has an Analyze Differences feature that uses an LLM to compare the two inputs.

I wanted to see whether the Phi-4 model could generate a similar web app. Using the Ollama CLI for this is a pain, so I used the Open WebUI project: https://github.com/open-webui/open-webui.

Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with built-in inference engine for RAG, making it a powerful AI deployment solution.

You can run Open WebUI locally using Docker. There are other options as well.

docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=https://foobar-11434.proxy.runpod.net/ -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

This runs the app at http://localhost:3000/. You can then select the model and start using the application.
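
Before pointing Open WebUI at a remote Ollama instance, it can help to confirm that the URL passed as OLLAMA_BASE_URL is reachable and that the model has been pulled. The sketch below assumes Ollama's model-listing endpoint at /api/tags and Node 18+ (for the built-in fetch). The helper names are hypothetical, and the RunPod URL is the placeholder from the command above.

```javascript
// Build the URL of Ollama's model-listing endpoint from a base URL.
function tagsUrl(baseUrl) {
  return `${baseUrl.replace(/\/+$/, "")}/api/tags`;
}

// Return true if a model family (e.g. "phi4") appears in an /api/tags payload.
// Names are matched either exactly or as "family:tag".
function hasModel(tagsPayload, family) {
  const models = tagsPayload.models ?? [];
  return models.some(
    (m) => m.name === family || m.name.startsWith(`${family}:`)
  );
}

// Fetch the model list and report whether the family is available.
async function checkOllama(baseUrl, family) {
  const res = await fetch(tagsUrl(baseUrl));
  if (!res.ok) {
    throw new Error(`HTTP ${res.status} from ${tagsUrl(baseUrl)}`);
  }
  return hasModel(await res.json(), family);
}

// Example (placeholder URL, replace with your own pod's proxy address):
// checkOllama("https://foobar-11434.proxy.runpod.net/", "phi4").then(console.log);
```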

Using Open WebUI, I prompted Phi-4 to create the web app. The prompt I used is shown below.

Create a HTML and JavaScript text comparison web page. Use an open source library do the text diff. Use Tailwind CSS for styling. Include:

- Two textareas placed side by side (horizontally) for input texts. Make sure they use 50 rows each.
- The main content should take 80% of the screen
- A 'Compare' button that shows text differences
- A 'Reset' button to clear all fields
- Compare and Reset buttons should be in the same row with space
- A result area that displays a proper diff view:
    - Green background for added text
    - Red background for removed text
    - Unchanged text in normal color
- Word count display for each textarea
- Make the UI modern and clean using Tailwind classes
- Ensure good spacing and padding between elements
- Make textareas equal width and height

Keep the code minimal and use vanilla JavaScript. Focus on basic functionality. Example Tailwind classes to consider: flex, gap, p-4, rounded-lg, etc.

It generated the following HTML.

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Text Comparison Tool</title>
    <!-- Tailwind CSS -->
    <link href="https://cdn.jsdelivr.net/npm/tailwindcss@3.2.0/dist/tailwind.min.css" rel="stylesheet">
    <style>
        .diff-added {
            background-color: #d4edda;
        }

        .diff-removed {
            background-color: #f8d7da;
        }
    </style>
</head>

<body class="bg-gray-100 flex items-center justify-center min-h-screen">
    <div class="w-full max-w-6xl p-8 bg-white shadow-lg rounded-lg">
        <h1 class="text-2xl font-bold text-center mb-8">Text Comparison Tool</h1>

        <!-- Input Textareas -->
        <div class="flex flex-col gap-4 mb-8">
            <textarea id="text1" rows="50" placeholder="Enter first text here..."
                class="p-3 border rounded-lg w-full"></textarea>
            <p>Word Count: <span id="wordCount1">0</span></p>

            <textarea id="text2" rows="50" placeholder="Enter second text here..."
                class="p-3 border rounded-lg w-full mt-4"></textarea>
            <p>Word Count: <span id="wordCount2">0</span></p>
        </div>

        <!-- Buttons -->
        <div class="flex justify-between items-center mb-8">
            <button id="compareButton"
                class="px-5 py-2 bg-blue-500 text-white rounded-lg hover:bg-blue-700 transition duration-300">Compare</button>
            <button id="resetButton"
                class="px-5 py-2 bg-red-500 text-white rounded-lg hover:bg-red-700 transition duration-300">Reset</button>
        </div>

        <!-- Result Area -->
        <div id="resultArea" class="p-4 border rounded-lg"></div>
    </div>

    <!-- diff-match-patch Library -->
    <script src="https://cdnjs.cloudflare.com/ajax/libs/diff-match-patch/20200319/dmp.js"></script>

    <!-- JavaScript for functionality -->
    <script>
        const dmp = new diff_match_patch();

        const text1 = document.getElementById('text1');
        const text2 = document.getElementById('text2');
        const compareButton = document.getElementById('compareButton');
        const resetButton = document.getElementById('resetButton');
        const resultArea = document.getElementById('resultArea');

        // Word Count Display
        function updateWordCount() {
            document.getElementById('wordCount1').innerText = text1.value.split(/\s+/).filter(Boolean).length;
            document.getElementById('wordCount2').innerText = text2.value.split(/\s+/).filter(Boolean).length;
        }

        text1.addEventListener('input', () => updateWordCount());
        text2.addEventListener('input', () => updateWordCount());

        // Compare Button Functionality
        compareButton.addEventListener('click', () => {
            const diffs = dmp.diff_main(text1.value, text2.value);
            dmp.diff_cleanupSemantic(diffs);

            resultArea.innerHTML = diffs.map(diff => {
                const className = diff[0] === 1 ? 'diff-added' : diff[0] === -1 ? 'diff-removed' : '';
                return `<span class="${className}">${diff[1]}</span>`;
            }).join('');
        });

        // Reset Button Functionality
        resetButton.addEventListener('click', () => {
            text1.value = '';
            text2.value = '';
            resultArea.innerHTML = '';
            updateWordCount();
        });
    </script>
</body>

</html>
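
The heart of the generated Compare handler is the mapping from diff tuples to styled spans. diff-match-patch returns an array of [operation, text] pairs, where the operation is -1 for a deletion, 0 for unchanged text, and 1 for an insertion. The standalone sketch below reproduces that step on a hand-written array, so it runs without the library. It also HTML-escapes the text before building markup, which the generated code does not do and which matters when the compared text contains characters such as < or &. The helper names are illustrative.

```javascript
// Escape characters that would otherwise be interpreted as HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Map diff-match-patch style tuples to spans using the page's CSS classes.
function renderDiff(diffs) {
  return diffs
    .map(([op, text]) => {
      const className =
        op === 1 ? "diff-added" : op === -1 ? "diff-removed" : "";
      return `<span class="${className}">${escapeHtml(text)}</span>`;
    })
    .join("");
}

// Hand-written sample in the same shape that diff_main() returns.
const sampleDiffs = [
  [0, "The "],
  [-1, "quick"],
  [1, "<b>slow</b>"],
  [0, " fox"],
];

console.log(renderDiff(sampleDiffs));
```

In the page itself, the result of renderDiff(diffs) would be assigned to resultArea.innerHTML in place of the inline map call.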

The generated code had the following issues:

  • The UI was not laid out correctly. The text areas were stacked on top of each other instead of being placed side by side as I had instructed.
  • Tailwind CSS was referenced incorrectly.
  • The diff-match-patch library was referenced incorrectly.

After I changed the library references to the following

<script src="https://cdn.tailwindcss.com"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/diff_match_patch/20121119/diff_match_patch.js"></script>

and the input area to

<!-- Input Textareas -->
<div class="flex w-full justify-between items-start gap-4">
    <textarea id="text1" rows="20" placeholder="Enter first text here..."
        class="w-1/2 h-64 p-4 border rounded-lg resize-none bg-white"></textarea>
    <p>Word Count: <span id="wordCount1">0</span></p>

    <textarea id="text2" rows="20" placeholder="Enter second text here..."
        class="w-1/2 h-64 p-4 border rounded-lg resize-none bg-white"></textarea>
    <p>Word Count: <span id="wordCount2">0</span></p>
</div>

I got the UI working as shown below.

I also tried the same prompt in ChatGPT and Claude to show a comparison. Below are their UI screenshots.

ChatGPT got the diff library path wrong, so I had to fix it. After fixing it, I got it working as shown below. It chose a different diff library: https://cdn.jsdelivr.net/npm/diff.

With Claude, I got the following. It made zero mistakes; it just worked. Claude also used the diff_match_patch library, as Phi-4 did.

