Okay - it's confusing because what you're using accents to indicate is usually referred to as 'stress' in English, and what you're using all-caps for is usually 'emphasis' or 'focus prosody' or the like. I'm very interested in the ALLCAPS phenomenon, but I think it's largely irrelevant here (other than being acoustically similar to Mandarin tone).
The áccent thing, though, is acoustically different from Mandarin tone. Take the digest example:
a. dígest (such as a compilation of summaries)
b. digést (such as a creature processing food to extract nutrition from it)
The (a) one is usually realised as /ˈdaɪdʒɛst/, and the (b) one as /dəˈdʒɛst/. So not only does the first syllable in (a) have a different vowel from the one in (b), but the first syllable in (b) will also have a drastically shorter duration than the first syllable of (a). These acoustic correlates of English stress are very different from what happens with tone in Mandarin, which is carried mainly by the pitch contour and doesn't affect syllable duration or vowel quality in the same fashion. You can visualise the acoustic waveforms of the accent-thing in English, compare them against the acoustic waveforms of the tone-thing in Mandarin, and see that they involve different acoustic properties. (So no need for any linguistic theory etc.)
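If you'd rather measure than eyeball, here is a rough sketch of the two measurements involved: how long a syllable lasts, and how its pitch (F0) moves over time. Everything in it is an assumption for illustration: the "syllables" are synthetic sine glides rather than real speech, the function names (`synth`, `duration_s`, `f0_autocorr`, `f0_contour`) are made up for this example, and the frequencies and durations are arbitrary values, not measurements of English or Mandarin. For real recordings you would use a proper phonetics tool such as Praat, since a bare autocorrelation like this is far too crude for actual speech.

```python
import math

SR = 8000  # sample rate in Hz (arbitrary choice for this sketch)

def synth(f0_start, f0_end, dur_s, sr=SR):
    """Synthetic stand-in for a syllable: a sine gliding from f0_start to f0_end Hz."""
    n = int(dur_s * sr)
    samples, phase = [], 0.0
    for i in range(n):
        f0 = f0_start + (f0_end - f0_start) * i / max(n - 1, 1)
        phase += 2 * math.pi * f0 / sr
        samples.append(math.sin(phase))
    return samples

def duration_s(samples, sr=SR, threshold=0.05):
    """Seconds between the first and last sample whose magnitude exceeds threshold."""
    loud = [i for i, x in enumerate(samples) if abs(x) > threshold]
    return (loud[-1] - loud[0] + 1) / sr if loud else 0.0

def f0_autocorr(frame, sr=SR, fmin=75, fmax=400):
    """Crude F0 estimate for one frame: the lag with the largest raw autocorrelation."""
    best_lag, best_val = 0, 0.0
    for lag in range(sr // fmax, sr // fmin + 1):
        if lag >= len(frame):
            break
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sr / best_lag if best_lag else 0.0

def f0_contour(samples, sr=SR, frame_s=0.04, hop_s=0.02):
    """F0 estimate per overlapping frame, giving a rough pitch track over time."""
    frame_len, hop = int(frame_s * sr), int(hop_s * sr)
    starts = range(0, len(samples) - frame_len + 1, hop)
    return [f0_autocorr(samples[s:s + frame_len], sr) for s in starts]

if __name__ == "__main__":
    # Hypothetical "reduced" vs "full" syllable: same level pitch, different length.
    reduced, full = synth(150, 150, 0.06), synth(150, 150, 0.20)
    print("durations (s):", duration_s(reduced), duration_s(full))

    # Hypothetical rising contour: same length as 'full', but the pitch moves.
    rising = synth(120, 240, 0.20)
    print("rising F0 track (Hz):", [round(f) for f in f0_contour(rising)])
```

The point of separating `duration_s` from `f0_contour` is that the stress contrast in dígest/digést shows up strongly in the first kind of measurement (plus vowel quality, which this sketch doesn't touch), whereas a tone contrast shows up mainly in the second.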