Adobe 22002484 User Guide - Page 68

Scanning tips, Recognize text in scanned documents

USING ACROBAT 9 STANDARD (page 63)
Creating PDFs
Last updated 9/30/2011
RGB input. Select Off when scanning a page with no pictures or filled areas, or when scanning at a resolution higher
than the effective range.
Halo Removal
When On (recommended), removes excess color at high-contrast edges, which may have been
introduced during either printing or scanning. This filter is used only on color input pages.
Scanning tips
• Acrobat scanning accepts images between 10 dpi and 3000 dpi. If you select Searchable Image or ClearScan for PDF
  Output Style, input resolution of 72 dpi or higher is required, and input resolution higher than 600 dpi is
  downsampled to 600 dpi or lower.
• To apply lossless compression to a scanned image, select one of these options under the Compression section in the
  Optimization Options dialog box: CCITT Group 4 for monochrome images, or Lossless for color or grayscale
  images. If this image is appended to a PDF document, and the file is saved using Save, the scanned image remains
  uncompressed. If the PDF document is saved using Save As, the scanned image may be compressed.
• For most pages, black-and-white scanning at 300 dpi produces text best suited for conversion. At 150 dpi, OCR
  accuracy is slightly lower, and more font-recognition errors occur; at 400 dpi and higher resolution, processing
  slows and compressed pages are bigger. If a page has many unrecognized words or very small text (9 points or
  smaller), try scanning at higher resolution. Scan in black and white whenever possible.
• When Recognize Text Using OCR is disabled, the full 10-to-3000 dpi resolution range may be used, but the
  recommended resolution is 72 dpi or higher. For Adaptive compression, 300 dpi is recommended for grayscale
  or RGB input, or 600 dpi for black-and-white input.
• Pages scanned in 24-bit color, 300 dpi, at 8-1/2-by-11 in. (21.59-by-27.94 cm) result in large images (25 MB) before
  compression. Your system may require 50 MB of virtual memory or more to scan the image. At 600 dpi, both
  scanning and processing typically are about four times slower than at 300 dpi.
• Avoid dithering or halftone scanner settings. These settings can improve the appearance of photographs, but they
  make it difficult to recognize text.
• For text printed on colored paper, try increasing the brightness and contrast by about 10%. If your scanner has
  color-filtering capability, consider using a filter or lamp that drops out the background color. Or if the text isn’t
  crisp or drops out, try adjusting scanner contrast and brightness to clarify the scan.
• If your scanner has a manual brightness control, adjust it so that characters are clean and well formed. If characters
  are touching, use a higher (brighter) setting. If characters are separated, use a lower (darker) setting.
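
The 25 MB figure in the tips above can be checked with simple arithmetic: pixel width times pixel height times bytes per pixel. The following is a minimal sketch, not part of Acrobat; the function name `uncompressed_bytes` is hypothetical and is used only for illustration.

```python
# Illustrative helper (hypothetical, not an Acrobat API): estimates the
# uncompressed size of a scanned page from its dimensions and scan settings.
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    width_px = round(width_in * dpi)    # pixels across the page
    height_px = round(height_in * dpi)  # pixels down the page
    return width_px * height_px * bits_per_pixel // 8

# 8-1/2-by-11 in. page, 24-bit color
letter_300 = uncompressed_bytes(8.5, 11, 300, 24)
letter_600 = uncompressed_bytes(8.5, 11, 600, 24)

print(letter_300)                # 25245000 bytes, roughly 25 MB
print(letter_600 // letter_300)  # 4: doubling the dpi quadruples the pixel count

# The same page in 1-bit black and white at 300 dpi is far smaller.
print(uncompressed_bytes(8.5, 11, 300, 1))  # 1051875 bytes, roughly 1 MB
```

The four-fold increase in pixel count at 600 dpi is consistent with the "about four times slower" figure, although the guide does not say that pixel count is the only factor.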
Recognize text in scanned documents
You can use Acrobat to recognize text in previously scanned documents that have already been converted to PDF.
Optical character recognition (OCR) software enables you to search, correct, and copy the text in a scanned PDF. To
apply OCR to a PDF, the original scanner resolution must have been set at 72 dpi or higher.
Note: Scanning at 300 dpi produces the best text for conversion. At 150 dpi, OCR accuracy is slightly lower.
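
The resolution limits stated in this guide (10 to 3000 dpi accepted, 72 dpi or higher required for OCR, input above 600 dpi downsampled) can be summarized as a small check. This is a sketch only: the function and constant names are hypothetical, and it assumes downsampling lands on exactly 600 dpi, whereas the guide says "600 dpi or lower".

```python
# Limits taken from the guide; the names below are illustrative assumptions.
MIN_DPI, MAX_DPI = 10, 3000         # range Acrobat scanning accepts
OCR_MIN_DPI, OCR_MAX_DPI = 72, 600  # range used when text recognition is on

def effective_dpi(input_dpi, ocr=True):
    """Return the resolution that would be processed, or raise ValueError."""
    if not MIN_DPI <= input_dpi <= MAX_DPI:
        raise ValueError(f"{input_dpi} dpi is outside the accepted {MIN_DPI}-{MAX_DPI} dpi range")
    if not ocr:
        return input_dpi  # full range is usable when OCR is disabled
    if input_dpi < OCR_MIN_DPI:
        raise ValueError(f"OCR requires {OCR_MIN_DPI} dpi or higher")
    # Assumption: input above 600 dpi is downsampled to exactly 600 dpi.
    return min(input_dpi, OCR_MAX_DPI)

print(effective_dpi(300))            # 300
print(effective_dpi(1200))           # 600
print(effective_dpi(50, ocr=False))  # 50
```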
More Help topics
“Adding unifying page elements” on page 109
Recognize text in a single document
1 Open the scanned PDF.
2 Choose Document > OCR Text Recognition > Recognize Text Using OCR.