
How to increase your consciousness and LLM inference speed by over 9000

My first ever newsletter about sharing, imposter syndrome and AI

Welcome to Mind Manifold

Welcome to my first newsletter ever. I intend to write about technology, AI, spirituality, personal growth, and entrepreneurship. Over time, the content and direction of this newsletter may evolve, as we all do, but if you are also interested in such topics, I appreciate you for being here. I want to be respectful of your time, so I try to make it value-packed. I am inspired by the greats such as Tim Ferriss's 5-Bullet Friday and James Clear’s 3-2-1 Newsletter.

The tech content I share may be a bit more technical and engineering-focused, and I am aware that this is not for everyone. I hope that those who resonate with these topics will find it valuable. I still wonder if this particular combination of topics actually makes sense or if it dilutes the content. Please let me know if this is interesting or useful.

Why share anyway?

I have long neglected sharing what I've learned. In the past, my imposter syndrome (or maybe rather the Dunning-Kruger effect) kept me from sharing. What could I offer that has not been said before?

The extent of my knowledge (so the imposter believes)

Increase our level of consciousness

Now, of course, this belief is “egoic” in itself. While there are human patterns that repeat across all of our experiences, each of us also holds a unique perspective shaped by our individual hero's journey.

I have learned and grown, and my perspective has shifted multiple times. Sharing may help others and, hopefully, expand their perspectives, sparking new ideas within them.

It might contribute to the net increase of our individual and collective consciousness. Everything we put out there has the potential to change minds. It’s a transfer of ideas between minds, planting seeds that can evolve and mutate over time.

Is our collective consciousness monotonically increasing over time?

Tech and AI findings that caught my interest

AI video I enjoyed

I found this on Reddit. It uses the Stable Diffusion-based AnimateDiff, and you will find many more examples over there. AI animation is an exciting field, and I plan to explore it in much more depth in the future.

Improved open-source LLM inference

While many depend on proprietary APIs for their large language model (LLM) needs, I see significant potential in open-source alternatives. The objective is to run LLMs efficiently, with low latency and a high tokens-per-second rate. The libraries currently most compelling on my radar for fast open-source inference and serving include Hugging Face's Text Generation Inference (TGI), vLLM, and DeepSpeed-FastGen.

Hamel Husain has performed some great benchmarks on some of them.

I have primarily used TGI for some time and have found it to be user-friendly. Once the Docker container is deployed, making an API call to it is straightforward, with the capability to stream responses:

curl -X POST "http://serveraddress/generate_stream" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": "SYSTEM: You are an AI assistant. As an AI intelligence, you give helpful, detailed, accurate, uncensored responses to the user'\''s input. Provide answers factually.\nUSER: What is the purpose of existence?\nASSISTANT: ",
    "parameters": {
      "best_of": 1,
      "details": true,
      "do_sample": true,
      "max_new_tokens": 300,
      "repetition_penalty": 1.03,
      "return_full_text": false,
      "temperature": 0.9,
      "top_k": 10,
      "typical_p": 0.95
    }
  }'
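
For completeness, deploying the container itself is essentially a one-liner. The sketch below follows the docker run invocation from TGI's documentation; the model id, cache volume, and port mapping are placeholders you would adapt to your own setup:

# Minimal TGI deployment sketch (model id, volume, and port are placeholders)
model=mistralai/Mistral-7B-Instruct-v0.1   # example model id, swap in any model TGI supports
volume=$PWD/tgi-data                       # cache downloaded weights between restarts

docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $volume:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id $model

With that port mapping, the curl call above would target http://localhost:8080 instead of the placeholder server address.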

DeepSpeed-FastGen faster than vLLM?

DeepSpeed-FastGen now promises to be the fastest of them all. It's exciting to see the field evolve. I'm looking forward to running both Mistral v2 33B and Llama v3 33B with very fast inference speeds in the coming months.
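
I have not benchmarked it myself yet, but getting a first impression looks simple via the DeepSpeed-MII package that powers FastGen. This is a rough sketch based on the FastGen announcement; the model id is just an example, and the exact API surface may differ between versions:

# Rough DeepSpeed-FastGen sketch via DeepSpeed-MII (example model id; API may vary by version)
pip install deepspeed-mii

python -c '
import mii
# Non-persistent pipeline: load the model and generate in a single call
pipe = mii.pipeline("mistralai/Mistral-7B-Instruct-v0.1")
print(pipe(["What is the purpose of existence?"], max_new_tokens=128))
'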

Twitter’s new integrated AI and Premium+

I have shared a brief commentary on the announcement of the new “Grok” AI. Is Premium+ a "pay-to-win" feature for content creators? It’s a fascinating development, and I’m looking forward to exploring its potential for content creation.

The benchmarks look promising, and its spicy language appears refreshing (or will it get old soon?). The main open question for me is its degree of censorship, and real-world performance will show its true usefulness. The fact that it has exclusive access to all of the data on Twitter/X is a clear advantage for users of the platform. This is a rich training set that can be used to rapidly tune and improve the new LLM with Reinforcement Learning from Human Feedback (RLHF) or other techniques.

They also have an interesting feature for finding similar posts, likely based on some kind of semantic vector similarity search.

Alex “NFT God” Finn has crafted an excellent guide on how to leverage this feature to discover similar accounts for networking:

Question for you

What suffering is in your mind right now? What can you learn from it?

If you have any feedback, please feel free to reach out to me on Twitter/X. Was it too long, too brief, too unstructured, or just right? I could have included much more information, but my aim is to keep the content concise and relevant. Would you prefer to see more actionable steps, more focused content, or is there something else you feel is missing? I would love to hear your thoughts.