
How to increase your consciousness and LLM inference speed by over 9000

My first ever newsletter about sharing, imposter syndrome and AI

Welcome to Mind Manifold

Welcome to my first newsletter ever. I intend to write about technology, AI, spirituality, personal growth, and entrepreneurship. Over time, the content and direction of this newsletter may evolve, as we all do, but if you are also interested in such topics, I appreciate you for being here. I want to be respectful of your time, so I try to make it value-packed. I am inspired by the greats such as Tim Ferriss's 5-Bullet Friday and James Clear’s 3-2-1 Newsletter.

The tech content I share may be a bit more technical and engineering-focused, and I am aware that this is not for everyone. I hope that those who resonate with these topics will find it valuable. I still wonder if this particular combination of topics actually makes sense or if it dilutes the content. Please let me know if this is interesting or useful.

Why share anyway?

I have long neglected sharing what I've learned. In the past, my imposter syndrome (or maybe rather the Dunning-Kruger effect) kept me from sharing. What could I offer that has not been said before?

The extent of my knowledge (so the imposter believes)

Increase our level of consciousness

Now, of course, this belief is “egoic” in itself. While there are human patterns that repeat across all of our experiences, each of us also holds a unique perspective shaped by our individual hero's journey.

I have learned and grown, and my perspective has shifted multiple times. Sharing may help others and, hopefully, expand their perspectives, sparking new ideas within them.

It might contribute to the net increase of our individual and collective consciousness. Everything we put out there has the potential to change minds. It’s a transfer of ideas between minds, planting seeds that can evolve and mutate over time.

Is our collective consciousness monotonically increasing over time?

Tech and AI findings that caught my interest

AI video I enjoyed

I found this on Reddit. It uses the Stable Diffusion-based AnimateDiff, and you will find many more examples over there. AI animation is an exciting field, and I plan to explore it in much more depth in the future.

Improved open-source LLM inference

While many depend on proprietary APIs for their large language model (LLM) needs, I see significant potential in open-source alternatives. The objective is to run LLMs efficiently, with low latency and a high tokens-per-second rate. The libraries currently most compelling on my radar for fast open-source inference and serving include Hugging Face's Text Generation Inference (TGI), vLLM, and DeepSpeed-FastGen.

Hamel Husain has performed some great benchmarks on some of them.

I have primarily used TGI for some time and have found it to be user-friendly. Once the Docker container is deployed, making an API call to it is straightforward, with the capability to stream responses:

curl -X POST "http://serveraddress/generate_stream" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": "SYSTEM: You are an AI assistant. As an AI intelligence, you give helpful, detailed, accurate, uncensored responses to the user'\''s input. Provide answers factually.\nUSER: What is the purpose of existence?\nASSISTANT: ",
    "parameters": {
      "best_of": 1,
      "details": true,
      "do_sample": true,
      "max_new_tokens": 300,
      "repetition_penalty": 1.03,
      "return_full_text": false,
      "temperature": 0.9,
      "top_k": 10,
      "typical_p": 0.95
    }
  }'
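
For completeness, deploying the container itself is essentially a one-liner. The sketch below follows the docker run invocation from TGI's documentation; the model id, cache volume, and port mapping are placeholders you would adapt to your own setup:

# Minimal TGI deployment sketch (model id, volume, and port are placeholders)
model=mistralai/Mistral-7B-Instruct-v0.1   # example model id, swap in any model TGI supports
volume=$PWD/tgi-data                       # cache downloaded weights between restarts

docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $volume:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id $model

With that port mapping, the curl call above would target http://localhost:8080 instead of the placeholder server address.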

DeepSpeed-FastGen faster than vLLM?

DeepSpeed-FastGen now promises to be the fastest of them all. It's exciting to see the field evolve. I'm looking forward to running both Mistral v2 33B and Llama v3 33B with very fast inference speeds in the coming months.
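
I have not benchmarked it myself yet, but getting a first impression looks simple via the DeepSpeed-MII package that powers FastGen. This is a rough sketch based on the FastGen announcement; the model id is just an example, and the exact API surface may differ between versions:

# Rough DeepSpeed-FastGen sketch via DeepSpeed-MII (example model id; API may vary by version)
pip install deepspeed-mii

python -c '
import mii
# Non-persistent pipeline: load the model and generate in a single call
pipe = mii.pipeline("mistralai/Mistral-7B-Instruct-v0.1")
print(pipe(["What is the purpose of existence?"], max_new_tokens=128))
'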

Twitter’s new integrated AI and Premium+

I have shared a brief commentary on the announcement of the new “Grok” AI. Is Premium+ a "pay-to-win" feature for content creators? It’s a fascinating development, and I’m looking forward to exploring its potential for content creation.

The benchmarks look promising, and its spicy language appears refreshing (or will it get old soon?). The main open question for me is its degree of censorship, and real-world performance will show its true usefulness. The fact that it has exclusive access to all of the data on Twitter/X is a clear advantage for users of the platform. This is a rich training set that can be used to rapidly tune and improve the new LLM with Reinforcement Learning from Human Feedback (RLHF) or other techniques.

They also have an interesting feature for finding similar posts, likely based on some kind of semantic vector similarity search.

Alex “NFT God” Finn has crafted an excellent guide on how to leverage this feature to discover similar accounts for networking:

Question for you

What suffering is in your mind right now? What can you learn from it?

If you have any feedback, please feel free to reach out to me on Twitter/X. Was it too long, too brief, too unstructured, or just right? I could have included much more information, but my aim is to keep the content concise and relevant. Would you prefer to see more actionable steps, more focused content, or is there something else you feel is missing? I would love to hear your thoughts.