Re: It bears repeating: programming exclusively in Java considered harmful.
C does not require pointers to 8-bit values. It requires sizeof(char) == 1 and that char has at least 8 bits. TMS320C40 has pointers to 16-bit values. I programmed it just fine in C. You could have pointers to 32-bit values and C would still work just fine. Programs written in C and many other languages assume pointers point at 8 bits. That causes those programs to crash on exotic architectures where the assumption is not true. It is not a fault of the language. It is either a decision taken by the programmer to support only the most common hardware or (far more likely) the programmer had no idea that pointers could point at anything other than bytes.
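A minimal sketch of the distinction, assuming nothing beyond the standard headers. The function names (check_guarantees, bits_in, to_octet) are made up for illustration. The point is that sizeof counts chars, and only CHAR_BIT says how many bits a char holds on the target.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* What the language promises: these hold on every conforming implementation. */
void check_guarantees(void)
{
    assert(sizeof(char) == 1);
    assert(CHAR_BIT >= 8);
}

/* Size in bits of an object whose sizeof is size_in_chars.
   sizeof counts chars, not octets, so multiply by CHAR_BIT rather than 8. */
size_t bits_in(size_t size_in_chars)
{
    return size_in_chars * (size_t)CHAR_BIT;
}

/* Keep only the low 8 bits, so the result is an octet even where char is wider. */
unsigned char to_octet(unsigned value)
{
    return (unsigned char)(value & 0xFFu);
}
```

Code that writes 8 where it means CHAR_BIT, or relies on unsigned char wrapping at 256, is making the assumption described above.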
If you create a new architecture where pointers point at things bigger than bytes, large amounts of software will not work on it without some programmer going through the source code and fixing every part that assumes pointers point at bytes. This will not just hurt programs written in C/C++. The software I wrote for TMS320C40 had a small quantity of ASCII strings. These wasted one byte per character because the extra code required to implement pointers to bytes would have been bigger than the potential saving. Build a bigger general purpose CPU with 32-bit chars and you may save on pointer size, but now byte streams either cost four times as much memory or take a huge performance hit from emulating pointers to bytes in software (while bringing back either 64-bit pointers or a 4GB address space).
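To give an idea of the extra code involved, here is a rough sketch of emulating byte access on top of word-sized storage. It assumes four octets packed into each 32-bit word, lowest octet first. The names get_octet and set_octet are hypothetical, and this is not the code I used on the TMS320C40.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read octet number i from an array of 32-bit words, four octets per word. */
unsigned get_octet(const uint32_t *words, size_t i)
{
    assert(words != NULL);
    unsigned shift = (unsigned)(i % 4) * 8;   /* bit position inside the word */
    return (unsigned)((words[i / 4] >> shift) & 0xFFu);
}

/* Write octet number i without disturbing the other three octets in the word. */
void set_octet(uint32_t *words, size_t i, unsigned value)
{
    assert(words != NULL);
    unsigned shift = (unsigned)(i % 4) * 8;
    uint32_t mask = (uint32_t)0xFFu << shift;
    words[i / 4] = (words[i / 4] & ~mask) | ((uint32_t)(value & 0xFFu) << shift);
}
```

Every byte access turns into a divide, a shift and a mask, and a "pointer to byte" needs the extra low bits of the index carried around with it. That is the cost to weigh against the memory saved.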
No, I did not notice that memory requirements stabilized at about 4GB. I typically work on the small side. My largest machine is 2GB, with most having considerably less. On this site you will find an unusually high proportion of people who would have problems being limited to a 32GB address space. Quadrupling the size of all byte streams would increase memory requirements for many users, not just the extremes who are over-represented here.
Garbage collection is a serious problem for me as it causes programs not to run in a deterministic amount of time. One of the great benefits of C is it does not inflict garbage collection on me unless I choose to use a library that provides garbage collected objects.
The OS kernel (written in C) could map blocks of memory to the same address to support dynamic type tags inside pointers. It would thrash the memory translation caches, but those could be increased in size at considerable expense of transistor count. C would have no serious problem extracting and comparing a type encoded into pointers. Your pointer type fields inside pointers could be implemented right now in software with existing hardware. Go off and implement it and we will see if your plan provides real benefits over storing dynamic type in the object.
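As a starting point, here is one way to put a type tag inside a pointer in plain C on existing hardware. Note this is a different technique from the kernel mapping idea above: it hides the tag in the low bits that are always zero for aligned objects. The tag names and the 2-bit width are hypothetical, it assumes objects are at least 4-byte aligned, and the pointer/uintptr_t round trip is implementation-defined (it works on the usual flat-address machines).

```c
#include <assert.h>
#include <stdint.h>

#define TAG_MASK ((uintptr_t)0x3u)   /* assumption: 2 tag bits, 4-byte alignment */

enum type_tag { TAG_INT = 0, TAG_STRING = 1, TAG_PAIR = 2 };   /* made-up tags */

/* Combine an aligned pointer and a tag into one word. */
uintptr_t tag_pointer(void *p, enum type_tag tag)
{
    assert(((uintptr_t)p & TAG_MASK) == 0);   /* low bits must be free */
    return (uintptr_t)p | (uintptr_t)tag;
}

/* Extract the type tag. */
enum type_tag pointer_tag(uintptr_t tagged)
{
    return (enum type_tag)(tagged & TAG_MASK);
}

/* Strip the tag to recover the original pointer. */
void *pointer_address(uintptr_t tagged)
{
    return (void *)(tagged & ~TAG_MASK);
}
```

Extracting and comparing the tag is one AND and one compare. Whether that beats loading a type field from the object is what would have to be measured.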
It is entirely possible to implement modern efficient garbage collection in C or any Turing complete language. (CPython's garbage collector is implemented in C.)
Modular arithmetic is an option selected by programmers (or selected for them by default). If you want overflow detection, the option for gcc is -ftrapv, which traps on signed overflow in addition, subtraction and multiplication.
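If you would rather not depend on a compiler flag, overflow can also be detected in portable C by testing before the operation. A minimal sketch for signed addition follows. The name checked_add is made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Add a and b. Returns false and leaves *sum untouched if the result would
   not fit in an int. The test runs before the addition, so the overflowing
   a + b is never evaluated. */
bool checked_add(int a, int b, int *sum)
{
    assert(sum != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}
```

The caller decides what an overflow means: saturate, report an error, or switch to a wider type.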