Yes, there is a performance hit but how large it is depends a lot on what you're doing.
"Switching" is one of the most costly operations and in Kernel mode you do not need to do it unless interacting with something in user space.. which you would only do because something in Userspace requested it somehow.
For other things, such as virtual memory, Microsoft found that the protections needed for virtual memory could be anywhere between 10 and 20%; but since there's no concept of virtual memory in kernel space: it's hard to say concretely that your program would be "20% faster". It would be too much of a different program.
If you run dpdk you get raw ethernet frames and run TCP yourself. This means that a program can receive any data sent to the machine. More sophisticate cards can do routing in hardware and present multiple "virtual" cards, but this is not yet commonly available in consumer hardware.
It's basically the kernel giving up on trying to do anything with the hardware, thus it's not available to any other process except the one that takes the hardware.
To have general purpose networking in user space you will end up with some other IPC which does not rely on sockets (because sockets are kernel) or shared memory which is dangerous as hell.