Let's start with this because it's the easiest: no, every client would not need to create a new connection for every request. All good HTTP implementations aggressively reuse connections whenever possible, even in HTTP/1.1. By far the best way to solve this problem is to get out of the business of doing custom networking and use HTTP.
With that said, let's tackle this question:
At a fundamental level, multiplexing is a way to send multiple parallel streams of data inside a single meta-stream of data. This is a very simple idea, and the simplest way to achieve it is a TLV (type-length-value) pattern for structuring your data.
For example, imagine you want to be able to send n parallel streams of data at once. You decide to identify each stream with a unique number, starting from 0 and counting upwards. How would you encode this?
The simplest way might look something like this:
+---------------------------------------------------------------+
| Stream Identifier (32) |
+---------------------------------------------------------------+
| Length (32) |
+---------------------------------------------------------------+
| Payload (0...) ...
+---------------------------------------------------------------+
In this case, the data in each stream is split into frames no larger than 2^32 − 1 bytes, each tagged with the stream it belongs to, and then sent. This gives the receiver enough information to work out where each frame begins and ends, and then to "demultiplex" the payload back into a series of parallel streams.
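To make this concrete, here is a minimal sketch of that framing in Python. The header layout (32-bit stream identifier, 32-bit length, then payload) matches the diagram above; the function names and the use of `struct` are my own choices, not anything from a real protocol implementation:

```python
import struct
from collections import defaultdict

# Frame header from the diagram: 32-bit stream id, 32-bit length (big-endian).
HEADER = struct.Struct(">II")

def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Tag a chunk of one stream's data with its stream id and length."""
    return HEADER.pack(stream_id, len(payload)) + payload

def demultiplex(data: bytes) -> dict[int, bytes]:
    """Walk a byte stream of frames, reassembling each logical stream."""
    streams = defaultdict(bytes)
    offset = 0
    while offset < len(data):
        stream_id, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        streams[stream_id] += data[offset:offset + length]
        offset += length
    return dict(streams)

# Two logical streams interleaved on one "connection":
wire = (encode_frame(0, b"hel") + encode_frame(1, b"wor")
        + encode_frame(0, b"lo") + encode_frame(1, b"ld"))
print(demultiplex(wire))  # {0: b'hello', 1: b'world'}
```

Note that the receiver never has to wait for stream 0 to finish before it can start handing stream 1's bytes to the application; that interleaving is the entire point of the exercise.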
It might seem too simple, but this is really all that multiplexing is. In fact, the format above is taken from the "frame format" section of the (now-obsolete) original HTTP/2 specification, RFC 7540, with some of the unnecessary fields removed.
However, an important note about multiplexing is that it does not inherently solve the problem you described in your original post:
This problem (called "head-of-line blocking") is not solved by multiplexing. Multiplexing reduces it, because you no longer need to wait for an entire message to be sent before you can handle a different one, but it doesn't eliminate it. You get further by introducing more complex concepts, like flow control and priorities. But even then, the shared TCP connection is a single bottleneck, and if it gets stuck, no stream can make forward progress at all.
All of this is to say, there's a reason HTTP/2 has been supplemented by HTTP/3: by running over QUIC instead of TCP, HTTP/3 finally removes the one remaining head-of-line-blocking problem that HTTP/2 could not.