I think you are confusing two similar but separate concepts.
Both the screen and each polygon in a scene have UVs, because at the end of the day the screen is just two triangles. You may have been reading about screen/viewport coordinates and its UVs, treating the viewport as if it were a polygon, which is correct, but those are not quite the same as the UVs of a mesh in the scene. You are drawing TO the viewport, after all.
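For example, the screen-space UV of a fragment is usually just its pixel coordinate normalized by the viewport resolution. A rough sketch (the names here are mine, not any particular API):

```python
# A minimal sketch: turning a pixel coordinate on the screen into a
# screen-space UV, the way a shader might derive it from the fragment
# coordinate and the viewport resolution.
def screen_uv(pixel_x, pixel_y, width, height):
    # Normalize to the 0..1 range; (0, 0) is one corner of the viewport,
    # (1, 1) the opposite one. Which corner is "up" depends on the API.
    return (pixel_x + 0.5) / width, (pixel_y + 0.5) / height

print(screen_uv(960, 540, 1920, 1080))  # roughly (0.5, 0.5), the screen center
```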
The way a pixel shader works is that it runs once for every pixel (actually, fragment) on the display and computes the appropriate color for it, based on the transformed meshes, the lights, and so on. Basically, you can think of it as shooting a ray from the camera plane towards the scene: the first thing it hits is what gets rendered (a simplification, but a useful one).
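As a mental model only, here is that per-fragment step written out as a plain loop. Real GPUs rasterize triangles rather than testing every pixel like this, and the triangle coordinates are made up for illustration:

```python
# Conceptual sketch: "for every fragment, figure out what covers it."
def edge(ax, ay, bx, by, px, py):
    # Signed area test: which side of edge AB the point P lies on.
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def covers(tri, px, py):
    # True if point (px, py) is inside the 2D triangle (after projection).
    (ax, ay), (bx, by), (cx, cy) = tri
    w0 = edge(bx, by, cx, cy, px, py)
    w1 = edge(cx, cy, ax, ay, px, py)
    w2 = edge(ax, ay, bx, by, px, py)
    return (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0)

WIDTH, HEIGHT = 8, 8
triangle = [(1.0, 1.0), (6.0, 2.0), (3.0, 6.0)]   # already in screen space

for y in range(HEIGHT):
    row = ""
    for x in range(WIDTH):
        # Sample at the pixel center; this is the "does the ray hit it" step.
        row += "#" if covers(triangle, x + 0.5, y + 0.5) else "."
    print(row)
```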
When it hits a face, it finds WHERE on that face it hit and interpolates the UVs of the face's vertices at that point; that is the fragment's UV. This coordinate is then used to look up the texture and fetch the texel (texture pixel, usually) that corresponds to that UV. That is normally the albedo color that will be used. It can be modified by lights, reflections, etc, of course.
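Continuing the toy loop above, here is a rough sketch of that "hit point, then UV, then texel" step, using barycentric weights to blend the vertex UVs. The triangle, the vertex UVs, and the tiny 4x4 "texture" are invented purely for illustration:

```python
# Sketch: interpolate the vertex UVs at the hit point, then fetch a texel.
def barycentric(tri, px, py):
    (ax, ay), (bx, by), (cx, cy) = tri
    area = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    w0 = ((cx - bx) * (py - by) - (cy - by) * (px - bx)) / area
    w1 = ((ax - cx) * (py - cy) - (ay - cy) * (px - cx)) / area
    return w0, w1, 1.0 - w0 - w1

def sample_nearest(texture, u, v):
    # Clamp to [0, 1] and pick the closest texel (no filtering).
    h, w = len(texture), len(texture[0])
    x = min(w - 1, max(0, int(u * w)))
    y = min(h - 1, max(0, int(v * h)))
    return texture[y][x]

triangle = [(1.0, 1.0), (6.0, 2.0), (3.0, 6.0)]
vertex_uvs = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]        # UVs authored per vertex
texture = [[(i * 60, j * 60, 128) for i in range(4)] for j in range(4)]

w0, w1, w2 = barycentric(triangle, 3.5, 3.5)             # a fragment inside the face
u = w0 * vertex_uvs[0][0] + w1 * vertex_uvs[1][0] + w2 * vertex_uvs[2][0]
v = w0 * vertex_uvs[0][1] + w1 * vertex_uvs[1][1] + w2 * vertex_uvs[2][1]
print((u, v), sample_nearest(texture, u, v))             # the albedo for that fragment
```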
The point is that the UVs of a face in the scene map into a texture. The mesh itself gets transformed by the model and view (and projection) matrices. The final fragment on the viewport can then be modified further, for example by a post-processing pass that treats the whole screen like a face itself. Usually you're not gonna touch the vertex shader for normal rendering; that's only if you need to add or modify vertex data (data stored per vertex, like color or normals) while rendering.
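For completeness, this is roughly the transform step a stock vertex shader performs with those matrices; the matrices below are toy values (a translation for the model matrix, identities for view and projection), not from any real scene:

```python
# Sketch: transform an object-space vertex position into clip space.
def mat_vec(m, v):
    # Multiply a 4x4 matrix (row-major, list of rows) by a 4-component vector.
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def mat_mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)] for r in range(4)]

identity   = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
model      = [[1, 0, 0, 2],   # toy model matrix: translate the mesh by +2 on x
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]]
view       = identity          # pretend the camera sits at the origin
projection = identity          # and skip perspective for brevity

vertex_position = [1.0, 0.5, -3.0, 1.0]          # object-space position, w = 1
mvp = mat_mul(projection, mat_mul(view, model))
print(mat_vec(mvp, vertex_position))              # clip-space position: [3.0, 0.5, -3.0, 1.0]
```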
I hope this helps clear up a bit of the confusion. It's not easy to visualize everything that goes on in a shader, but it gets easier if you understand each independent step as a single thing. And remember it all happens once per pixel (actually, once per fragment, which can correspond to more or less than one pixel depending on the device and settings, but on regular screens a fragment is usually one pixel).