
I need to produce a PDF with Cyrillic text that can be copied and pasted into other files. I am required to use pdflatex and the font style and font size set by \fontsize, as in my MWE below.

This is what I get when I copy the text:

Жирный текст Îáû÷íûé òåêñò 

Note that I can copy the text in bold, but cannot copy the text that follows. What should I do to produce a PDF such that I can copy Обычный текст?

MWE:

\documentclass[12pt,dvipsnames]{report}
% https://www.overleaf.com/learn/latex/Russian
\usepackage{iftex}
\iftutex
  % For LuaTeX or XeTeX Use Google's
  % OpenType Noto fonts for typesetting
  % Russian text
  \usepackage{fontspec}
  \defaultfontfeatures{Ligatures={TeX}}
  \setmainfont{Noto Serif}
  \setsansfont{Noto Sans}
  \setmonofont{Noto Sans Mono}
\else
  % For pdfTeX we must use old
  % 8-bit font technologies
  \usepackage[T2A]{fontenc}
\fi
%https://tex.stackexchange.com/questions/64188/what-are-good-ways-to-make-pdflatex-output-copy-and-pasteable
\usepackage{cmap}
\begin{document}
\fontsize{12pt}{14.4pt}
\selectfont
\textbf{Жирный текст}
Обычный текст
\end{document}

Here is the log file (generated with cmap commented out):

This is pdfTeX, Version 3.141592653-2.6-1.40.25 (MiKTeX 24.1) (preloaded format=pdflatex 2025.1.29) 26 MAR 2025 14:08
entering extended mode
 restricted \write18 enabled.
 %&-line parsing enabled.
**./ru_test.tex
(ru_test.tex
LaTeX2e <2023-11-01> patch level 1
L3 programming layer <2024-01-04>
(C:\Users\MiKTeX\tex/latex/base\report.cls
Document Class: report 2023/05/17 v1.4n Standard LaTeX document class
(C:\Users\MiKTeX\tex/latex/base\size12.clo
File: size12.clo 2023/05/17 v1.4n Standard LaTeX file (size option)
)
\c@part=\count187
\c@chapter=\count188
\c@section=\count189
\c@subsection=\count190
\c@subsubsection=\count191
\c@paragraph=\count192
\c@subparagraph=\count193
\c@figure=\count194
\c@table=\count195
\abovecaptionskip=\skip48
\belowcaptionskip=\skip49
\bibindent=\dimen140
)
(C:\Users\MiKTeX\tex/generic/iftex\iftex.sty
Package: iftex 2022/02/03 v1.0f TeX engine tests
)
(C:\Users\MiKTeX\tex/latex/base\fontenc.sty
Package: fontenc 2021/04/29 v2.0v Standard LaTeX package
(C:\Users\MiKTeX\tex/latex/cyrillic\t2aenc.def
File: t2aenc.def 2023/11/07 v1.0k Cyrillic encoding definition file
Now handling font encoding T2A ...
...
processing UTF-8 mapping file for font encoding T2A (C:\Users\MiKTeX\tex/latex/base\t2aenc.dfu File: t2aenc.dfu 2022/06/07 v1.3c UTF-8 support defining Unicode char U+00A4 (decimal 164) defining Unicode char U+00A7 (decimal 167) defining Unicode char U+00AB (decimal 171) defining Unicode char U+00BB (decimal 187) defining Unicode char U+0131 (decimal 305) defining Unicode char U+0237 (decimal 567) defining Unicode char U+0400 (decimal 1024) defining Unicode char U+0401 (decimal 1025) defining Unicode char U+0402 (decimal 1026) defining Unicode char U+0403 (decimal 1027) defining Unicode char U+0404 (decimal 1028) defining Unicode char U+0405 (decimal 1029) defining Unicode char U+0406 (decimal 1030) defining Unicode char U+0407 (decimal 1031) defining Unicode char U+0408 (decimal 1032) defining Unicode char U+0409 (decimal 1033) defining Unicode char U+040A (decimal 1034) defining Unicode char U+040B (decimal 1035) defining Unicode char U+040C (decimal 1036) defining Unicode char U+040D (decimal 1037) defining Unicode char U+040E (decimal 1038) defining Unicode char U+040F (decimal 1039) defining Unicode char U+0410 (decimal 1040) defining Unicode char U+0411 (decimal 1041) defining Unicode char U+0412 (decimal 1042) defining Unicode char U+0413 (decimal 1043) defining Unicode char U+0414 (decimal 1044) defining Unicode char U+0415 (decimal 1045) defining Unicode char U+0416 (decimal 1046) defining Unicode char U+0417 (decimal 1047) defining Unicode char U+0418 (decimal 1048) defining Unicode char U+0419 (decimal 1049) defining Unicode char U+041A (decimal 1050) defining Unicode char U+041B (decimal 1051) defining Unicode char U+041C (decimal 1052) defining Unicode char U+041D (decimal 1053) defining Unicode char U+041E (decimal 1054) defining Unicode char U+041F (decimal 1055) defining Unicode char U+0420 (decimal 1056) defining Unicode char U+0421 (decimal 1057) defining Unicode char U+0422 (decimal 1058) defining Unicode char U+0423 (decimal 1059) defining 
Unicode char U+0424 (decimal 1060) defining Unicode char U+0425 (decimal 1061) defining Unicode char U+0426 (decimal 1062) defining Unicode char U+0427 (decimal 1063) defining Unicode char U+0428 (decimal 1064) defining Unicode char U+0429 (decimal 1065) defining Unicode char U+042A (decimal 1066) defining Unicode char U+042B (decimal 1067) defining Unicode char U+042C (decimal 1068) defining Unicode char U+042D (decimal 1069) defining Unicode char U+042E (decimal 1070) defining Unicode char U+042F (decimal 1071) defining Unicode char U+0430 (decimal 1072) defining Unicode char U+0431 (decimal 1073) defining Unicode char U+0432 (decimal 1074) defining Unicode char U+0433 (decimal 1075) defining Unicode char U+0434 (decimal 1076) defining Unicode char U+0435 (decimal 1077) defining Unicode char U+0436 (decimal 1078) defining Unicode char U+0437 (decimal 1079) defining Unicode char U+0438 (decimal 1080) defining Unicode char U+0439 (decimal 1081) defining Unicode char U+043A (decimal 1082) defining Unicode char U+043B (decimal 1083) defining Unicode char U+043C (decimal 1084) defining Unicode char U+043D (decimal 1085) defining Unicode char U+043E (decimal 1086) defining Unicode char U+043F (decimal 1087) defining Unicode char U+0440 (decimal 1088) defining Unicode char U+0441 (decimal 1089) defining Unicode char U+0442 (decimal 1090) defining Unicode char U+0443 (decimal 1091) defining Unicode char U+0444 (decimal 1092) defining Unicode char U+0445 (decimal 1093) defining Unicode char U+0446 (decimal 1094) defining Unicode char U+0447 (decimal 1095) defining Unicode char U+0448 (decimal 1096) defining Unicode char U+0449 (decimal 1097) defining Unicode char U+044A (decimal 1098) defining Unicode char U+044B (decimal 1099) defining Unicode char U+044C (decimal 1100) defining Unicode char U+044D (decimal 1101) defining Unicode char U+044E (decimal 1102) defining Unicode char U+044F (decimal 1103) defining Unicode char U+0450 (decimal 1104) defining Unicode char U+0451 
(decimal 1105) defining Unicode char U+0452 (decimal 1106) defining Unicode char U+0453 (decimal 1107) defining Unicode char U+0454 (decimal 1108) defining Unicode char U+0455 (decimal 1109) defining Unicode char U+0456 (decimal 1110) defining Unicode char U+0457 (decimal 1111) defining Unicode char U+0458 (decimal 1112) defining Unicode char U+0459 (decimal 1113) defining Unicode char U+045A (decimal 1114) defining Unicode char U+045B (decimal 1115) defining Unicode char U+045C (decimal 1116) defining Unicode char U+045D (decimal 1117) defining Unicode char U+045E (decimal 1118) defining Unicode char U+045F (decimal 1119) defining Unicode char U+0490 (decimal 1168) defining Unicode char U+0491 (decimal 1169) defining Unicode char U+0492 (decimal 1170) defining Unicode char U+0493 (decimal 1171) defining Unicode char U+0496 (decimal 1174) defining Unicode char U+0497 (decimal 1175) defining Unicode char U+0498 (decimal 1176) defining Unicode char U+0499 (decimal 1177) defining Unicode char U+049A (decimal 1178) defining Unicode char U+049B (decimal 1179) defining Unicode char U+049C (decimal 1180) defining Unicode char U+049D (decimal 1181) defining Unicode char U+04A0 (decimal 1184) defining Unicode char U+04A1 (decimal 1185) defining Unicode char U+04A2 (decimal 1186) defining Unicode char U+04A3 (decimal 1187) defining Unicode char U+04A4 (decimal 1188) defining Unicode char U+04A5 (decimal 1189) defining Unicode char U+04AA (decimal 1194) defining Unicode char U+04AB (decimal 1195) defining Unicode char U+04AE (decimal 1198) defining Unicode char U+04AF (decimal 1199) defining Unicode char U+04B0 (decimal 1200) defining Unicode char U+04B1 (decimal 1201) defining Unicode char U+04B2 (decimal 1202) defining Unicode char U+04B3 (decimal 1203) defining Unicode char U+04B6 (decimal 1206) defining Unicode char U+04B7 (decimal 1207) defining Unicode char U+04B8 (decimal 1208) defining Unicode char U+04B9 (decimal 1209) defining Unicode char U+04BA (decimal 1210) 
defining Unicode char U+04BB (decimal 1211) defining Unicode char U+04C0 (decimal 1216) defining Unicode char U+04C1 (decimal 1217) defining Unicode char U+04C2 (decimal 1218) defining Unicode char U+04D0 (decimal 1232) defining Unicode char U+04D1 (decimal 1233) defining Unicode char U+04D2 (decimal 1234) defining Unicode char U+04D3 (decimal 1235) defining Unicode char U+04D4 (decimal 1236) defining Unicode char U+04D5 (decimal 1237) defining Unicode char U+04D6 (decimal 1238) defining Unicode char U+04D7 (decimal 1239) defining Unicode char U+04D8 (decimal 1240) defining Unicode char U+04D9 (decimal 1241) defining Unicode char U+04DA (decimal 1242) defining Unicode char U+04DB (decimal 1243) defining Unicode char U+04DC (decimal 1244) defining Unicode char U+04DD (decimal 1245) defining Unicode char U+04DE (decimal 1246) defining Unicode char U+04DF (decimal 1247) defining Unicode char U+04E2 (decimal 1250) defining Unicode char U+04E3 (decimal 1251) defining Unicode char U+04E4 (decimal 1252) defining Unicode char U+04E5 (decimal 1253) defining Unicode char U+04E6 (decimal 1254) defining Unicode char U+04E7 (decimal 1255) defining Unicode char U+04E8 (decimal 1256) defining Unicode char U+04E9 (decimal 1257) defining Unicode char U+04EC (decimal 1260) defining Unicode char U+04ED (decimal 1261) defining Unicode char U+04EE (decimal 1262) defining Unicode char U+04EF (decimal 1263) defining Unicode char U+04F0 (decimal 1264) defining Unicode char U+04F1 (decimal 1265) defining Unicode char U+04F2 (decimal 1266) defining Unicode char U+04F3 (decimal 1267) defining Unicode char U+04F4 (decimal 1268) defining Unicode char U+04F5 (decimal 1269) defining Unicode char U+04F8 (decimal 1272) defining Unicode char U+04F9 (decimal 1273) defining Unicode char U+200C (decimal 8204) defining Unicode char U+2013 (decimal 8211) defining Unicode char U+2014 (decimal 8212) defining Unicode char U+2018 (decimal 8216) defining Unicode char U+2019 (decimal 8217) defining Unicode 
char U+201C (decimal 8220)
defining Unicode char U+201D (decimal 8221)
defining Unicode char U+201E (decimal 8222)
defining Unicode char U+2030 (decimal 8240)
defining Unicode char U+2031 (decimal 8241)
defining Unicode char U+2116 (decimal 8470)
defining Unicode char U+2329 (decimal 9001)
defining Unicode char U+3008 (decimal 12296)
defining Unicode char U+232A (decimal 9002)
defining Unicode char U+3009 (decimal 12297)
defining Unicode char U+2423 (decimal 9251)
defining Unicode char U+27E8 (decimal 10216)
defining Unicode char U+27E9 (decimal 10217)
defining Unicode char U+FB00 (decimal 64256)
defining Unicode char U+FB01 (decimal 64257)
defining Unicode char U+FB02 (decimal 64258)
defining Unicode char U+FB03 (decimal 64259)
defining Unicode char U+FB04 (decimal 64260)
defining Unicode char U+FB05 (decimal 64261)
defining Unicode char U+FB06 (decimal 64262)
))
LaTeX Font Info: Trying to load font information for T2A+cmr on input line 112.
(C:\Users\MiKTeX\tex/latex/cyrillic\t2acmr.fd
File: t2acmr.fd 2001/08/11 v1.0a Computer Modern Cyrillic font definitions
))
(C:\Users\MiKTeX\tex/latex/l3backend\l3backend-pdftex.def
File: l3backend-pdftex.def 2024-01-04 L3 backend support: PDF output (pdfTeX)
\l__color_backend_stack_int=\count196
\l__pdf_internal_box=\box51
)
LaTeX Warning: Unused global option(s):
    [dvipsnames].
(ru_test.aux)
\openout1 = `ru_test.aux'.
LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 23.
LaTeX Font Info: ... okay on input line 23.
LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 23.
LaTeX Font Info: ... okay on input line 23.
LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 23.
LaTeX Font Info: ... okay on input line 23.
LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 23.
LaTeX Font Info: ... okay on input line 23.
LaTeX Font Info: Checking defaults for TS1/cmr/m/n on input line 23.
LaTeX Font Info: ... okay on input line 23.
LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 23.
LaTeX Font Info: ... okay on input line 23.
LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 23.
LaTeX Font Info: ... okay on input line 23.
LaTeX Font Info: Checking defaults for T2A/cmr/m/n on input line 23.
LaTeX Font Info: ... okay on input line 23.
[1
{C:/Users/MiKTeX/fonts/map/pdftex/pdftex.map}] (ru_test.aux)
***********
LaTeX2e <2023-11-01> patch level 1
L3 programming layer <2024-01-04>
***********
)
Here is how much of TeX's memory you used:
 1351 strings out of 474486
 21668 string characters out of 5744227
 1924542 words of memory out of 5000000
 23699 multiletter control sequences out of 15000+600000
 559988 words of font info for 39 fonts, out of 8000000 for 9000
 1141 hyphenation exceptions out of 8191
 42i,5n,51p,201b,204s stack positions out of 10000i,1000n,20000p,200000b,200000s
<C:\Users\MiKTeX\fonts/pk/ljfour/lh/lh-t2a/dpi600\larm1200.pk> <C:\Users\MiKTeX\fonts/pk/ljfour/lh/lh-t2a/dpi600\labx1200.pk>
Output written on ru_test.pdf (1 page, 9516 bytes).
PDF statistics:
 35 PDF objects out of 1000 (max. 8388607)
 0 named destinations out of 1000 (max. 500000)
 1 words of extra memory for PDF output out of 10000 (max. 10000000)
  • Is your TeX system up-to-date? Commented Mar 25 at 20:19
  • Simple answer: don't use "old 8-bit font technologies" (citation from your comment). Commented Mar 25 at 20:25
  • @DavidCarlisle I only said that if we leave old technologies (i.e. pdftex) behind, we have fewer problems. For example, this question would not be here. And the comment in the presented code says it clearly: this is old technology. We need special hacks in order to use an obscure encoding like T2A, and we shouldn't be surprised when something doesn't work. Life without pdftex in TeX distributions would be simpler. It will probably take a few more years for others to figure this out, but eventually it will happen. In the meantime, many people will still encounter unnecessary problems. Commented Mar 26 at 5:51
  • @UlrikeFischer Of course, my TeX system is up-to-date. Commented Mar 26 at 9:43
  • @Antonio Sorry, but how can you believe that LaTeX2e <2023-11-01> patch level 1 is up-to-date? It is 2025 now. My MiKTeX says This is pdfTeX, Version 3.141592653-2.6-1.40.27 (MiKTeX 25.3) and LaTeX2e <2025-06-01> pre-release-2 (develop 2025-3-25 branch) (and copy and paste works fine). Check for updates in both user and admin mode. Commented Mar 26 at 11:17

2 Answers

1

In your example, when pdfLaTeX is used (i.e., the \else branch), the fontenc package is loaded with the [T2A] option and cmap is loaded only afterwards. The cmap package needs the font encoding (via fontenc) to be set up after cmap itself is loaded, so that it can correctly "wrap" all fonts. If fontenc has already been loaded, cmap cannot process some of the fonts, which would explain why only part of your text copies correctly.

If cmap is loaded first, copying from the PDF works correctly with both pdfLaTeX and LuaLaTeX; at least for me, everything copies perfectly.

For LuaLaTeX, however, the cmap package is not actually needed; on this point I agree with the previous answer.
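
As a rough illustration of the reordering described above, here is a sketch of the MWE with cmap loaded before fontenc. Placing cmap inside the \else branch (so it is only loaded under pdfLaTeX) and dropping the dvipsnames option are choices made for this sketch, not something the original MWE does.

```latex
% Sketch only: same document as the MWE, with cmap moved before fontenc.
% dvipsnames is dropped here because the log reports it as an unused option.
\documentclass[12pt]{report}
\usepackage{iftex}
\iftutex
  % LuaLaTeX / XeLaTeX branch, unchanged from the MWE (cmap not loaded here)
  \usepackage{fontspec}
  \defaultfontfeatures{Ligatures={TeX}}
  \setmainfont{Noto Serif}
  \setsansfont{Noto Sans}
  \setmonofont{Noto Sans Mono}
\else
  % pdfLaTeX branch: load cmap first, then set up the font encoding
  \usepackage{cmap}
  \usepackage[T2A]{fontenc}
\fi
\begin{document}
\fontsize{12pt}{14.4pt}\selectfont
\textbf{Жирный текст}
Обычный текст
\end{document}
```

Whether both pieces of text then copy correctly may still depend on the installed distribution and the PDF viewer, so it is worth testing the result on the target system.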

  • Many thanks! This worked for me! Commented Mar 26 at 11:19
  • This answer isn't strictly wrong, but it should not be needed in any LaTeX less than 2 years old: the mappings needed for Cyrillic are loaded by default, so cmap is not needed. Commented Mar 26 at 20:30
  • but it works and does the job! Commented Mar 27 at 15:11
4

If I delete the cmap package and run your document with pdflatex it cuts and pastes as

Жирный текст Обычный текст 

From the lualatex-generated PDF I get

Жирный текст Обычный текст 

which looks the same to me (and the same as your input).
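
For reference, a sketch of the pdflatex-only test without cmap might look like the following. The two \typeout lines are an addition for diagnosis and are not part of the original MWE: they write the LaTeX format date and the current value of pdfTeX's \pdfgentounicode parameter to the log, which may help when comparing an installation where copying works against one where it does not.

```latex
% Sketch only: pdfLaTeX test document without cmap.
\documentclass[12pt]{report}
\usepackage[T2A]{fontenc}
\begin{document}
\fontsize{12pt}{14.4pt}\selectfont
\textbf{Жирный текст}
Обычный текст

% Diagnostic lines (an assumption of this sketch, not from the thread):
% \fmtversion holds the release date of the LaTeX format in use, and
% \pdfgentounicode is the pdfTeX parameter that enables ToUnicode generation.
\typeout{LaTeX format date: \fmtversion}
\typeout{pdfgentounicode = \the\pdfgentounicode}
\end{document}
```

Run it with pdflatex only and look for the two lines in the .log file; an old format date there would be consistent with the suggestion in the comments to update the distribution.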

  • If I delete the cmap package, I cannot copy correctly both lines of text Commented Mar 26 at 9:42
  • I use MiKTeX, last updated January 16, 2025 Commented Mar 26 at 11:07
  • Probably the MiKTeX distribution contained some outdated files. I downloaded and installed MiKTeX in 2025. Commented Mar 27 at 15:10
