Consider the following MWE:
\documentclass{article}
\usepackage{lmodern}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[croatian]{babel}
\input{glyphtounicode} % *
\pdfgentounicode=1 % **
\begin{document}
A B C ČĆ D DŽĐ \dots
\end{document}
Copying and pasting from Adobe Reader gives the following (UTF-8 bytes in parentheses): A(0x41) B(0x42) C(0x43) Č(0xC48C) Ć(0xC486) D(0x44) DŽ(0x44 0xC5BD) Ð(0xC390) .(0x2E) .(0x2E) .(0x2E)
I am aware of the shortcomings of T1 in this MWE:
- not having separate slots for ETH and D WITH STROKE, hence I get ETH (0xC390): https://tex.stackexchange.com/a/569460/115879
- not having special slots for the Croatian digraphs, hence they must be typeset as two separate characters (they are actually entered that way in the source code; using the precomposed Ǆ and others yields an error)
- not having a special slot for the ellipsis, hence I get FULL STOP three times.
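To illustrate the mechanism I am asking about: as far as I understand, glyphtounicode.tex is a list of \pdfglyphtounicode declarations, and a single entry can be overridden by hand after the file is input. The following is only a minimal sketch. It assumes that the font's encoding vector names the glyph in the shared T1 slot `Eth`, which I have not verified for Latin Modern.

```latex
\documentclass{article}
\usepackage{lmodern}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[croatian]{babel}
\input{glyphtounicode}
\pdfgentounicode=1
% Assumption: the glyph in the shared T1 slot is named "Eth" in the
% font's encoding vector. If the name differs, this line has no effect.
\pdfglyphtounicode{Eth}{0110} % U+0110 (D WITH STROKE) instead of U+00D0 (ETH)
\begin{document}
Đ
\end{document}
```

If the assumption holds, a genuine ETH would then also be copied as D WITH STROKE, since both characters share the one glyph.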
Which mappings (refer to the files in the distribution/package, please) will be used when:
- the line marked * is commented out
- the line marked ** is commented out
- both the * and ** lines are commented out?
EDIT: Based on David Carlisle's comment, I am changing the question a bit: can someone tell me what happened, e.g., five years ago, when glyphtounicode was not input automatically?
Thanks!