
Camera reading for blind people

Neto, Roberto Ferreira
Source: Instituto Politécnico de Leiria Publisher: Instituto Politécnico de Leiria
Type: Master's dissertation
Published in 2014 POR
Search relevance
27.57%
Dissertation submitted to the Escola Superior de Tecnologia e Gestão of IPL for the degree of Master in Computer Engineering - Mobile Computing, supervised by Doutor Nuno Fonseca.; The absence of sight makes a blind person's life quite difficult. However, technology can slightly improve small aspects of daily life. In this context, this work describes the development process of an application for blind people. The project is called Camera Blind For Blind People, and its ultimate purpose is to develop an application that lets a blind user use a mobile device to have text read aloud, whether that text is written on a sheet of paper, a sign, a wall or another surface. The project aims to design a prototype iOS application, built on the combined and integrated use of optical character recognition (OCR) frameworks and speech synthesis (TTS) frameworks, that lets the user capture an image with the camera of a mobile device and hear the text contained in that image. The OCR text recognition process will be optimized by applying filters to the captured images...

The Use of Latent Semantic Indexing to Mitigate OCR Effects of Related Document Images

BULCAO-NETO, Renato F.; CAMACHO-GUERRERO, Jose A.; DUTRA, Marcio; BARREIRO, Alvaro; PARAPAR, Javier; MACEDO, Alessandra A.
Source: GRAZ UNIV TECHNOLOGY, INST INFORMATION SYSTEMS COMPUTER MEDIA-IICM Publisher: GRAZ UNIV TECHNOLOGY, INST INFORMATION SYSTEMS COMPUTER MEDIA-IICM
Type: Journal article
ENG
Search relevance
27.45%
Due to the widespread, multipurpose use of document images and the large number of document image repositories now available, robust information retrieval mechanisms and systems are increasingly in demand. This paper presents an approach to support the automatic generation of relationships among document images by exploiting Latent Semantic Indexing (LSI) and Optical Character Recognition (OCR). We developed the LinkDI (Linking of Document Images) service, which extracts and indexes the content of document images, computes its latent semantics, and defines relationships among images as hyperlinks. LinkDI was tested on document image repositories, and its performance was evaluated by comparing the quality of the relationships created among textual documents with that of the relationships created among their respective document images. Using those same document images, we ran further experiments to compare the performance of LinkDI with and without the LSI technique. Experimental results showed that LSI can mitigate the effects of common OCR misrecognition, which reinforces the feasibility of using LinkDI to relate heavily degraded OCR output.; CNPq[557976/2008-1]; FAPESP[05/60038-5]; FAPESP[05/60729-8]; FAPESP[06/58984-2]; FAPESP[09/14292-8]; FAPESP[2009/05504-1]; Spanish Ministerio de Ciencia e Innovacion[TIN2008-06566-C04-04]; FEDER; Xunta de Galicia[07SIN005206PR]; Innolution Sistemas de Informatica

Proposta de arquitetura de um sistema com base em OCR neuronal para resgate e indexação de escritas paleográficas do sec. XVI ao XIX; Proposal of a system architecture based on neural OCR for the rescue and indexing of paleographic writings from the 16th to the 19th century

Mendonça, Fábio Lúcio Lopes de
Source: Universidade de Brasília Publisher: Universidade de Brasília
Type: Dissertation
POR
Search relevance
37.92%
Dissertation (master's)—Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Elétrica, 2008.; This work proposes an architecture for a system that processes and automatically recognizes the text of paleographic documents, using OCR (Optical Character Recognition) based on artificial neural networks. The proposed system is meant to operate in the context of transcribing the text of paleographic documents written from the 16th to the 19th century, documents from colonial Brazil that were digitized from the printed originals held at the Arquivo Ultramarino de Lisboa, one of the achievements of the Projeto Resgate of the Brazilian Ministry of Culture. The proposed architecture includes modules for segmenting the digitized document images, for analysing the segments with OCR in an attempt to recognize the text, for training the OCR by building a dictionary of recognized words, and for storing the text transcribed from the document images. To evaluate this architecture, a software prototype was developed that lets the user manually segment a document image, train a simple OCR and use that OCR to extract some text information from the digitized paleographic document. The conclusion is that the proposed architecture is functional...

Characterisation of the structure of ocr, the gene 0.3 protein of bacteriophage T7

Atanasiu, C.; Byron, O.; McMiken, H.; Sturrock, S. S.; Dryden, D. T. F.
Source: Oxford University Press Publisher: Oxford University Press
Type: Journal article
Published on 15/07/2001 EN
Search relevance
27.66%
The product of gene 0.3 of bacteriophage T7, ocr, is a potent inhibitor of type I DNA restriction and modification enzymes. We have used biophysical methods to examine the mass, stability, shape and surface charge distribution of ocr. Ocr is a dimeric protein with hydrodynamic behaviour equivalent to a prolate ellipsoid of axial ratio 4.3 ± 0.7:1 and mass of 27 kDa. The protein is resistant to denaturation but removal of the C-terminal region reduces stability substantially. Six amino acids, N4, D25, N43, D62, S68 and W94, are all located on the surface of the protein and N4 and S68 are also located at the interface between the two 116 amino acid monomers. Negatively charged amino acid side chains surround W94 but these side chains are not part of the highly acidic C-terminus after W94. Ocr is able to displace a short DNA duplex from the binding site of a type I enzyme with a dissociation constant of the order of 100 pM or better. These results suggest that ocr is of a suitable size and shape to effectively block the DNA binding site of a type I enzyme and has a large negatively charged patch on its surface. This charge distribution may be complementary to the charge distribution within the DNA binding site of type I DNA restriction and modification enzymes.

Interaction of the ocr gene 0.3 protein of bacteriophage T7 with EcoKI restriction/modification enzyme

Atanasiu, C.; Su, T.-J.; Sturrock, S. S.; Dryden, D. T. F.
Source: Oxford University Press Publisher: Oxford University Press
Type: Journal article
Published on 15/09/2002 EN
Search relevance
27.66%
The ocr protein, the product of gene 0.3 of bacteriophage T7, is a structural mimic of the phosphate backbone of B-form DNA. In total it mimics 22 phosphate groups over ∼24 bp of DNA. This mimicry allows it to block DNA binding by type I DNA restriction enzymes and to inhibit these enzymes. We have determined that multiple ocr dimers can bind stoichiometrically to the archetypal type I enzyme, EcoKI. One dimer binds to the core methyltransferase and two to the complete bifunctional restriction and modification enzyme. Ocr can also bind to the component subunits of EcoKI. Binding affinity to the methyltransferase core is extremely strong with a large favourable enthalpy change and an unfavourable entropy change. This strong interaction prevents the dissociation of the methyltransferase which occurs upon dilution of the enzyme. This stabilisation arises because the interaction appears to involve virtually the entire surface area of ocr and leads to the enzyme completely wrapping around ocr.

Dissection of the DNA Mimicry of the Bacteriophage T7 Ocr Protein using Chemical Modification

Stephanou, Augoustinos S.; Roberts, Gareth A.; Cooper, Laurie P.; Clarke, David J.; Thomson, Andrew R.; MacKay, C. Logan; Nutley, Margaret; Cooper, Alan; Dryden, David T.F.
Source: Elsevier Publisher: Elsevier
Type: Journal article
Published on 21/08/2009 EN
Search relevance
27.82%
The homodimeric Ocr (overcome classical restriction) protein of bacteriophage T7 is a molecular mimic of double-stranded DNA and a highly effective competitive inhibitor of the bacterial type I restriction/modification system. The surface of Ocr is replete with acidic residues that mimic the phosphate backbone of DNA. In addition, Ocr also mimics the overall dimensions of a bent 24-bp DNA molecule. In this study, we attempted to delineate these two mechanisms of DNA mimicry by chemically modifying the negative charges on the Ocr surface. Our analysis reveals that removal of about 46% of the carboxylate groups per Ocr monomer results in an ∼ 50-fold reduction in binding affinity for a methyltransferase from a model type I restriction/modification system. The reduced affinity between Ocr with this degree of modification and the methyltransferase is comparable with the affinity of DNA for the methyltransferase. Additional modification to remove ∼ 86% of the carboxylate groups further reduces its binding affinity, although the modified Ocr still binds to the methyltransferase via a mechanism attributable to the shape mimicry of a bent DNA molecule. Our results show that the electrostatic mimicry of Ocr increases the binding affinity for its target enzyme by up to ∼ 800-fold.

A Functional Nuclear Localization Sequence in the C. elegans TRPV Channel OCR-2

Ezak, Meredith J.; Ferkey, Denise M.
Source: Public Library of Science Publisher: Public Library of Science
Type: Journal article
Published on 21/09/2011 EN
Search relevance
27.57%
The ability to modulate gene expression in response to sensory experience is critical to the normal development and function of the nervous system. Calcium is a key activator of the signal transduction cascades that mediate the process of translating a cellular stimulus into transcriptional changes. With the recent discovery that the mammalian Cav1.2 calcium channel can be cleaved, enter the nucleus and act as a transcription factor to control neuronal gene expression, a more direct role for the calcium channels themselves in regulating transcription has begun to be appreciated. Here we report the identification of a nuclear localization sequence (NLS) in the C. elegans transient receptor potential vanilloid (TRPV) cation channel OCR-2. TRPV channels have previously been implicated in transcriptional regulation of neuronal genes in the nematode, although the precise mechanism remains unclear. We show that the NLS in OCR-2 is functional, being able to direct nuclear accumulation of a synthetic cargo protein as well as the carboxy-terminal cytosolic tail of OCR-2 where it is endogenously found. Furthermore, we discovered that a carboxy-terminal portion of the full-length channel can localize to the nucleus of neuronal cells. These results suggest that the OCR-2 TRPV cation channel may have a direct nuclear function in neuronal cells that was not previously appreciated.

Exploring the DNA mimicry of the Ocr protein of phage T7

Roberts, Gareth A.; Stephanou, Augoustinos S.; Kanwar, Nisha; Dawson, Angela; Cooper, Laurie P.; Chen, Kai; Nutley, Margaret; Cooper, Alan; Blakely, Garry W.; Dryden, David T. F.
Source: Oxford University Press Publisher: Oxford University Press
Type: Journal article
EN
Search relevance
27.66%
DNA mimic proteins have evolved to control DNA-binding proteins by competing with the target DNA for binding to the protein. The Ocr protein of bacteriophage T7 is the most studied DNA mimic and functions to block the DNA-binding groove of Type I DNA restriction/modification enzymes. This binding prevents the enzyme from cleaving invading phage DNA. Each 116 amino acid monomer of the Ocr dimer has an unusual amino acid composition with 34 negatively charged side chains but only 6 positively charged side chains. Extensive mutagenesis of the charges of Ocr revealed a regression of Ocr activity from wild-type activity to partial activity then to variants inactive in antirestriction but deleterious for cell viability and lastly to totally inactive variants with no deleterious effect on cell viability. Throughout the mutagenesis the Ocr mutant proteins retained their folding. Our results show that the extreme bias in charged amino acids is not necessary for antirestriction activity but that less charged variants can affect cell viability by leading to restriction proficient but modification deficient cell phenotypes.

Quantitative Computed Tomography (QCT) as a Radiology Reporting Tool by Using Optical Character Recognition (OCR) and Macro Program

Lee, Young Han; Song, Ho-Taek; Suh, Jin-Suck
Source: Springer-Verlag Publisher: Springer-Verlag
Type: Journal article
EN
Search relevance
27.77%
The objectives are (1) to introduce a new concept of making a quantitative computed tomography (QCT) reporting system by using optical character recognition (OCR) and macro program and (2) to illustrate the practical usages of the QCT reporting system in radiology reading environment. This reporting system was created as a development tool by using an open-source OCR software and an open-source macro program. The main module was designed for OCR to report QCT images in radiology reading process. The principal processes are as follows: (1) to save a QCT report as a graphic file, (2) to recognize the characters from an image as a text, (3) to extract the T scores from the text, (4) to perform error correction, (5) to reformat the values into QCT radiology reporting template, and (6) to paste the reports into the electronic medical record (EMR) or picture archiving and communicating system (PACS). The accuracy test of OCR was performed on randomly selected QCTs. QCT as a radiology reporting tool successfully acted as OCR of QCT. The diagnosis of normal, osteopenia, or osteoporosis is also determined. Error correction of OCR is done with AutoHotkey-coded module. The results of T scores of femoral neck and lumbar vertebrae had an accuracy of 100 and 95.4 %...
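
Steps (3) and (5) of the pipeline above, together with the normal/osteopenia/osteoporosis decision, can be sketched in a few lines. This is only an illustration in Python, not the AutoHotkey module the abstract mentions. The report layout, the regular expression, the function names and the WHO-style T-score cutoffs (normal at -1.0 and above, osteoporosis at -2.5 and below) are all assumptions, since the abstract does not specify them.

```python
import re

# Hypothetical OCR output; the real QCT report layout is not given in the abstract.
SAMPLE_TEXT = """Femoral neck T-score: -1.8
Lumbar spine T-score: -2.7
Total hip T-score: 0.3"""

# Assumed line layout: "<site> T-score: <signed number>".
T_SCORE_LINE = re.compile(
    r"^(?P<site>.+?)\s+T-score:\s*(?P<value>[-+]?\d+(?:\.\d+)?)\s*$",
    re.MULTILINE,
)


def extract_t_scores(text):
    """Return a {site: T-score} mapping parsed from OCR text."""
    return {
        m.group("site").strip(): float(m.group("value"))
        for m in T_SCORE_LINE.finditer(text)
    }


def classify(t_score):
    """Map a T-score to a category using assumed WHO-style cutoffs."""
    if t_score <= -2.5:
        return "osteoporosis"
    if t_score < -1.0:
        return "osteopenia"
    return "normal"


def format_report(text):
    """Build one report line per measured site (template is hypothetical)."""
    return [
        f"{site}: T-score {value:+.1f} ({classify(value)})"
        for site, value in extract_t_scores(text).items()
    ]


if __name__ == "__main__":
    for line in format_report(SAMPLE_TEXT):
        print(line)
```

A real system would also need the error-correction step (4), for example rejecting values outside a plausible T-score range before they reach the report.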

Islet Oxygen Consumption Rate (OCR) Dose Predicts Insulin Independence in Clinical Islet Autotransplantation

Papas, Klearchos K.; Bellin, Melena D.; Sutherland, David E. R.; Suszynski, Thomas M.; Kitzmann, Jennifer P.; Avgoustiniatos, Efstathios S.; Gruessner, Angelika C.; Mueller, Kathryn R.; Beilman, Gregory J.; Balamurugan, Appakalai N.; Loganathan, Gopalakri
Source: Public Library of Science Publisher: Public Library of Science
Type: Journal article
EN_US
Search relevance
27.9%
Background: Reliable in vitro islet quality assessment assays that can be performed routinely, prospectively, and are able to predict clinical transplant outcomes are needed. In this paper we present data on the utility of an assay based on cellular oxygen consumption rate (OCR) in predicting clinical islet autotransplant (IAT) insulin independence (II). IAT is an attractive model for evaluating characterization assays regarding their utility in predicting II due to an absence of confounding factors such as immune rejection and immunosuppressant toxicity. Methods: Membrane integrity staining (FDA/PI), OCR normalized to DNA (OCR/DNA), islet equivalent (IE) and OCR (viable IE) normalized to recipient body weight (IE dose and OCR dose), and OCR/DNA normalized to islet size index (ISI) were used to characterize autoislet preparations (n = 35). Correlation between pre-IAT islet product characteristics and II was determined using receiver operating characteristic analysis. Results: Preparations that resulted in II had significantly higher OCR dose and IE dose (p<0.001). These islet characterization methods were highly correlated with II at 6–12 months post-IAT (area-under-the-curve (AUC) = 0.94 for IE dose and 0.96 for OCR dose). FDA/PI (AUC = 0.49) and OCR/DNA (AUC = 0.58) did not correlate with II. OCR/DNA/ISI may have some utility in predicting outcome (AUC = 0.72). Conclusions: Commonly used assays to determine whether a clinical islet preparation is of high quality prior to transplantation are greatly lacking in sensitivity and specificity. While IE dose is highly predictive...
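
The AUC values quoted above come from receiver operating characteristic analysis. As a rough illustration of what such a number measures, the sketch below computes AUC as the probability that a randomly chosen positive case (here, a preparation that led to insulin independence) has a higher score than a randomly chosen negative one, counting ties as one half. The function name and the sample scores are hypothetical and are not data from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: numeric assay values, one per preparation.
    labels: truthy for a positive outcome, falsy otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    # Each positive/negative pair contributes 1 if ranked correctly, 0.5 on a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    example_scores = [0.9, 0.8, 0.7, 0.4, 0.3]
    example_labels = [1, 1, 0, 1, 0]
    print(roc_auc(example_scores, example_labels))
```

An AUC near 0.5, as reported for FDA/PI, means the assay ranks outcomes no better than chance, while values near 1.0 mean almost every positive case outranks every negative one.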

Estratégias para melhoria do desempenho de ferramentas comerciais de reconhecimento óptico de caracteres; Strategies for improving the performance of commercial optical character recognition tools

Ferreira Alves, Neide; Dueire Lins, Rafael (Advisor)
Source: Universidade Federal de Pernambuco Publisher: Universidade Federal de Pernambuco
Type: Other
PT_BR
Search relevance
27.57%
To assess the performance of commercial Optical Character Recognition (OCR) tools, metrics are needed that measure how close a transcribed text is to the original, since any change to an image, however small, affects the OCR transcriptions. This work presents a new metric for evaluating OCR transcriptions: filtering techniques (brightness, contrast, resolution, rotation, etc.) are applied to the original image so that minimal changes produce numerous images, which are submitted to the OCR and result in distinct texts. An algorithm was developed to compare the generated texts, analysing everything from the number of lines to character-by-character equality. By analysing which characters occur most frequently, this algorithm generates a new text file. With this methodology, the generated file was very close to the original, with a higher accuracy rate than the files transcribed without the filtering process.
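
The comparison step described above (number of lines first, then characters, keeping the most frequent one) can be sketched as a simple majority vote. This is a minimal reading of the abstract, not the author's algorithm: the rule of discarding transcriptions whose line count or line length disagrees with the majority, and all function names, are assumptions.

```python
from collections import Counter


def most_common(items):
    """Return the most frequent item; ties go to the first one seen."""
    return Counter(items).most_common(1)[0][0]


def vote_line(candidates):
    """Vote character by character among candidates of the majority length."""
    length = most_common(len(c) for c in candidates)
    aligned = [c for c in candidates if len(c) == length]
    return "".join(most_common(chars) for chars in zip(*aligned))


def vote_transcriptions(texts):
    """Combine several OCR transcriptions of the same page into one text."""
    split = [t.splitlines() for t in texts]
    # Keep only transcriptions that agree with the majority line count.
    n_lines = most_common(len(lines) for lines in split)
    kept = [lines for lines in split if len(lines) == n_lines]
    return "\n".join(vote_line(list(column)) for column in zip(*kept))


if __name__ == "__main__":
    # Hypothetical OCR outputs for differently filtered copies of one image.
    outputs = [
        "The quick brown fox\njumps over",
        "The qu1ck brown fox\njumps over",
        "The quick brown f0x\njumps ovor",
        "The quick brown fox",
    ]
    print(vote_transcriptions(outputs))
```

Voting by position only works when the candidate lines have the same length, which is why mismatched lines are dropped here; a fuller version would align them first, for example with an edit-distance alignment.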

Um estudo sobre reconhecimento visual de caracteres através de redes neurais; A study on visual character recognition using neural networks

Osorio, Fernando Santos
Source: Universidade Federal do Rio Grande do Sul Publisher: Universidade Federal do Rio Grande do Sul
Type: Dissertation Format: application/pdf
POR
Search relevance
27.66%
This work presents a study on visual character recognition using neural networks. It covers Digital Image Processing, character recognition systems and neural networks. It closes with a proposed implementation of an OCR system aimed at recognizing printed characters, which uses a neural network developed specifically for this application. The proposed system, called N2OCR, has an implemented prototype that is also described in this work. On Digital Image Processing, several topics are presented, covering image acquisition, image processing and pattern recognition. For image acquisition, the emphasis is on acquisition devices and the types of images they produce. For image processing, the aspects relevant to textual images are addressed, including halftoning, histogram generation and modification, thresholding and filtering operations. For pattern recognition, a brief analysis of the related techniques is given. The various types of character recognition systems are addressed...

Probabilistic Management of OCR Data using an RDBMS

Kumar, Arun; Ré, Christopher
Source: Cornell University Publisher: Cornell University
Type: Journal article
Search relevance
27.9%
The digitization of scanned forms and documents is changing the data sources that enterprises manage. To integrate these new data sources with enterprise data, the current state-of-the-art approach is to convert the images to ASCII text using optical character recognition (OCR) software and then to store the resulting ASCII text in a relational database. The OCR problem is challenging, and so the output of OCR often contains errors. In turn, queries on the output of OCR may fail to retrieve relevant answers. State-of-the-art OCR programs, e.g., the OCR powering Google Books, use a probabilistic model that captures many alternatives during the OCR process. Only when the results of OCR are stored in the database, do these approaches discard the uncertainty. In this work, we propose to retain the probabilistic models produced by OCR process in a relational database management system. A key technical challenge is that the probabilistic data produced by OCR software is very large (a single book blows up to 2GB from 400kB as ASCII). As a result, a baseline solution that integrates these models with an RDBMS is over 1000x slower versus standard text processing for single table select-project queries. However, many applications may have quality-performance needs that are in between these two extremes of ASCII and the complete model output by the OCR software. Thus...
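
To see why discarding the uncertainty hurts retrieval, consider a toy model in which each character position keeps a probability for every alternative the OCR considered. The sketch below is a deliberate simplification with hypothetical values: it assumes positions are independent, whereas the abstract does not describe the structure of the models the paper actually stores.

```python
from math import prod

# Hypothetical per-position alternatives for one scanned word.
OCR_MODEL = [
    {"F": 0.9, "P": 0.1},
    {"0": 0.6, "o": 0.4},
    {"r": 1.0},
    {"d": 0.8, "a": 0.2},
]


def map_string(model):
    """Most likely string: what a plain-text column would store."""
    return "".join(max(dist, key=dist.get) for dist in model)


def prob_at(model, query, start):
    """Probability that `query` occurs at `start`, assuming independent positions."""
    if start < 0 or start + len(query) > len(model):
        return 0.0
    return prod(model[start + i].get(ch, 0.0) for i, ch in enumerate(query))


def best_match(model, query):
    """Highest match probability over all start offsets."""
    offsets = range(len(model) - len(query) + 1)
    return max((prob_at(model, query, s) for s in offsets), default=0.0)


if __name__ == "__main__":
    print(map_string(OCR_MODEL))          # the single string kept by plain text
    print(best_match(OCR_MODEL, "Ford"))  # still retrievable from the model
```

A keyword search for "Ford" on the stored most-likely string finds nothing, while the retained model still assigns the word a non-trivial probability; the cost is the much larger storage footprint the abstract describes.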

A Complete Workflow for Development of Bangla OCR

Omee, Farjana Yeasmin; Himel, Shiam Shabbir; Bikas, Md. Abu Naser
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 05/04/2012
Search relevance
27.66%
Developing a Bangla OCR requires a number of algorithms and methods. Many efforts have been made to develop a Bangla OCR, but none has produced an error-free one; each has some shortcomings. We discuss the problem scope of the existing Bangla OCRs. In this paper, we present the basic steps required for developing a Bangla OCR and a complete workflow for its development, mentioning all the possible algorithms required.

Quality of OCR for Degraded Text Images

Hartley, Roger T.; Crumpton, Kathleen
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 05/02/1999
Search relevance
27.77%
Commercial OCR packages work best with high-quality scanned images. They often produce poor results when the image is degraded, either because the original itself was poor quality, or because of excessive photocopying. The ability to predict the word failure rate of OCR from a statistical analysis of the image can help in making decisions in the trade-off between the success rate of OCR and the cost of human correction of errors. This paper describes an investigation of OCR of degraded text images using a standard OCR engine (Adobe Capture). The documents were selected from those in the archive at Los Alamos National Laboratory. By introducing noise in a controlled manner into perfect documents, we show how the quality of OCR can be predicted from the nature of the noise. The preliminary results show that a simple noise model can give good prediction of the number of OCR errors.; Comment: 7 pages

OCR Context-Sensitive Error Correction Based on Google Web 1T 5-Gram Data Set

Bassil, Youssef; Alwani, Mohammad
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 01/04/2012
Search relevance
27.72%
Since the dawn of the computing era, information has been represented digitally so that it can be processed by electronic computers. Paper books and documents were abundant and widely being published at that time; and hence, there was a need to convert them into digital format. OCR, short for Optical Character Recognition was conceived to translate paper-based books into digital e-books. Regrettably, OCR systems are still erroneous and inaccurate as they produce misspellings in the recognized text, especially when the source document is of low printing quality. This paper proposes a post-processing OCR context-sensitive error correction method for detecting and correcting non-word and real-word OCR errors. The cornerstone of this proposed approach is the use of Google Web 1T 5-gram data set as a dictionary of words to spell-check OCR text. The Google data set incorporates a very large vocabulary and word statistics entirely reaped from the Internet, making it a reliable source to perform dictionary-based error correction. The core of the proposed solution is a combination of three algorithms: The error detection, candidate spellings generator, and error correction algorithms, which all exploit information extracted from Google Web 1T 5-gram data set. Experiments conducted on scanned images written in different languages showed a substantial improvement in the OCR error correction rate. As future developments...
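
The three stages named above (error detection, candidate generation, error correction) can be illustrated with a toy version that uses bigram counts in place of 5-grams. The tiny count tables are hypothetical stand-ins for the Google Web 1T data, the candidate generator is the familiar edit-distance-1 construction, and only non-word errors are handled; none of this is claimed to be the authors' exact method.

```python
import string

# Hypothetical counts standing in for a large n-gram data set.
UNIGRAMS = {"the": 100, "cat": 20, "car": 30, "sat": 15, "on": 50, "mat": 10}
BIGRAMS = {("the", "cat"): 12, ("the", "car"): 9, ("cat", "sat"): 7}


def edits1(word):
    """All strings one edit (delete, replace, insert, transpose) away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    return set(deletes + replaces + inserts + transposes)


def correct(words):
    """Replace out-of-vocabulary words using the previous word as context."""
    corrected = []
    for word in words:
        # Error detection: a word found in the vocabulary is left alone.
        if word in UNIGRAMS:
            corrected.append(word)
            continue
        # Candidate generation: vocabulary words one edit away.
        candidates = [c for c in edits1(word) if c in UNIGRAMS]
        if not candidates:
            corrected.append(word)
            continue
        # Error correction: prefer context (bigram), then frequency, then name.
        previous = corrected[-1] if corrected else None
        best = max(
            candidates,
            key=lambda c: (BIGRAMS.get((previous, c), 0), UNIGRAMS[c], c),
        )
        corrected.append(best)
    return corrected


if __name__ == "__main__":
    print(correct(["the", "cax", "sat", "on", "the", "mat"]))
```

With the preceding word "the", the misread "cax" resolves to "cat" even though "car" is the more frequent word overall, which is the point of using context rather than a plain dictionary lookup.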

OCR Post-Processing Error Correction Algorithm using Google Online Spelling Suggestion

Bassil, Youssef; Alwani, Mohammad
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 01/04/2012
Search relevance
27.72%
With the advent of digital optical scanners, a lot of paper-based books, textbooks, magazines, articles, and documents are being transformed into an electronic version that can be manipulated by a computer. For this purpose, OCR, short for Optical Character Recognition was developed to translate scanned graphical text into editable computer text. Unfortunately, OCR is still imperfect as it occasionally mis-recognizes letters and falsely identifies scanned text, leading to misspellings and linguistics errors in the OCR output text. This paper proposes a post-processing context-based error correction algorithm for detecting and correcting OCR non-word and real-word errors. The proposed algorithm is based on Google's online spelling suggestion which harnesses an internal database containing a huge collection of terms and word sequences gathered from all over the web, convenient to suggest possible replacements for words that have been misspelled during the OCR process. Experiments carried out revealed a significant improvement in OCR error correction rate. Future research can improve upon the proposed algorithm so much so that it can be parallelized and executed over multiprocessing platforms.; Comment: LACSC - Lebanese Association for Computational Sciences...

App móvil para reconocer texto en imágenes; Mobile app for recognizing text in images

Montes Llorente, Víctor Manuel
Source: Universitat Autònoma de Barcelona Publisher: Universitat Autònoma de Barcelona
Type: info:eu-repo/semantics/bachelorThesis; Text Format: application/pdf
Published on 29/06/2015 SPA
Search relevance
27.77%
OCR readers are a very useful tool for scanning images that contain text and retrieving that text. That is the main idea of this project, with the addition that it is integrated into an Android application that lets any user obtain the text of an image taken with the camera or read from the device's internal storage. To achieve this, an OCR processing core written in C++ was provided. This core runs in two parts: the first is processed on the Android device itself, and the second is processed by an Apache server, which then returns the result to the device. This lets the user apply a translator that is also integrated into the application.

Optical character categorization: Clustering as it applies to OCR

Greenwald, Jennifer
Source: Rochester Institute of Technology Publisher: Rochester Institute of Technology
Type: Doctoral thesis
EN_US
Search relevance
37.45%
I applied clustering analysis to the problem of creating tagged training data for optical character recognition (OCR). The creation of labeled character data by hand is a slow and cumbersome process. My belief is that clustering methods can be applied to character data before tagging it, allowing the operator to label entire groups of characters at once and greatly speeding the time in which tagged character data can be generated. This thesis will provide proof of concept as a basis for more in depth research and eventually the creation of a sophisticated application utilizing these techniques for the generation of labeled training data for OCR systems.

The Case of the 35 Gigabyte Digital Record: OCR and Digital Workflows

Rowan, Kelley F
Source: FIU Digital Commons Publisher: FIU Digital Commons
Type: text Format: application/pdf
Search relevance
37.45%
This presentation was given at the Panhandle Library Access Network's (PLAN) Innovation Conference: Digitization- Preserving the Past for the Future Conference on August 14th, 2015. The presentation uses a specific collection of directories as a case study of the complications librarians and archivists face in digitizing older materials that may also be quite large, such as a directory. Prime OCR and ABBYY FineReader are discussed, along with their pros and cons. Troubleshooting and editing with Adobe Photoshop are also discussed.