So, the áccent thing is acoustically different from Mandarin tone. Take the digest example:
a. dígest (such as a compilation of summaries)
b. digést (such as a creature processing food to extract nutrition from it)
The (a) one is usually realised as /ˈdaɪdʒɛst/, while the (b) one is usually realised as /dəˈdʒɛst/. So not only does the first syllable in (a) have a different vowel from the one in (b), but the first syllable in (b), being unstressed and reduced, will also have a drastically shorter duration than the first syllable of (a). These acoustic correlates in English are very different from what occurs with tone in Mandarin, which is carried mainly by the pitch contour and doesn't affect syllable duration or vowel quality in the same fashion. You can visualise the acoustic waveforms of the accent-thing in English, compare them against the acoustic waveforms of the tone-thing in Mandarin, and see that they involve different acoustic properties. (So no need for any linguistic theory etc.)
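If you would rather put numbers on that comparison than eyeball it, here is a rough Python sketch of the two measurements involved: how long a syllable lasts, and what its pitch (F0) is. Everything in it is an assumption made for illustration: the function names `duration_s` and `f0_autocorr` are made up, the syllable is assumed to have been segmented by hand already, and a synthetic 200 Hz sine stands in for a real recording. The pitch estimate is a crude autocorrelation peak-pick, not what a proper phonetics tool does.

```python
import math

def duration_s(samples, rate):
    """Duration of an already-segmented syllable, in seconds."""
    return len(samples) / rate

def f0_autocorr(frame, rate, fmin=75.0, fmax=400.0):
    """Crude F0 estimate in Hz: the lag with the highest raw autocorrelation."""
    lo = int(rate / fmax)  # shortest period considered, in samples
    hi = int(rate / fmin)  # longest period considered, in samples
    best_lag, best_score = lo, float("-inf")
    for lag in range(lo, min(hi, len(frame) - 1) + 1):
        # Raw (unnormalised) autocorrelation at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return rate / best_lag

# Synthetic stand-in for one syllable: 40 ms of a 200 Hz sine at 16 kHz.
# This is NOT real speech data, just something to run the functions on.
RATE = 16000
syllable = [math.sin(2 * math.pi * 200 * n / RATE) for n in range(640)]

print(duration_s(syllable, RATE))   # duration in seconds
print(f0_autocorr(syllable, RATE))  # estimated pitch in Hz
```

With real recordings you would run the duration measurement on the first syllable of dígest and of digést, and track the pitch estimate frame by frame across a Mandarin syllable. The expectation from the argument above is that the English pair differs mostly in duration (and vowel quality), while Mandarin tones differ mostly in how the pitch moves.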