Document engineering approaches toward scalable and structured multimedia, web and printable documents

PIMENTEL, Maria da Graca; BULTERMAN, Dick C. A.; SOARES, Luiz Fernando Gomes
Source: SPRINGER Publisher: SPRINGER
Type: Journal article
ENG
Search relevance
46.31%
Document engineering is the computer science discipline that investigates systems for documents in any form and in all media. As with the relationship between software engineering and software, document engineering is concerned with principles, tools and processes that improve our ability to create, manage, and maintain documents (http://www.documentengineering.org). The ACM Symposium on Document Engineering is an annual meeting of researchers active in document engineering: it is sponsored by ACM by means of the ACM SIGWEB Special Interest Group. In this editorial, we first point to work carried out in the context of document engineering that is directly related to multimedia tools and applications. We conclude with a summary of the papers presented in this special issue.

ActiveTimesheets: extending web-based multimedia documents with dynamic modification and reuse features

Martins, Diogo Santana; Pimentel, Maria da Graça Campos
Source: Association for Computing Machinery - ACM; ACM Special Interest Group on Hypertext and the Web - ACM SIGWEB; Fort Collins Publisher: Association for Computing Machinery - ACM; ACM Special Interest Group on Hypertext and the Web - ACM SIGWEB; Fort Collins
Type: Conference paper or conference object
ENG
Search relevance
46.37%
Methods for authoring Web-based multimedia presentations have advanced considerably with the improvements provided by HTML5. However, authors of these multimedia presentations still lack expressive, declarative language constructs to encode synchronized multimedia scenarios. The SMIL Timesheets language is a serious contender to tackle this problem as it provides alternatives to associate a declarative timing specification to an HTML document. However, in its current form, the SMIL Timesheets language does not meet important requirements observed in Web-based multimedia applications. In order to tackle this problem, this paper presents the ActiveTimesheets engine, which extends the SMIL Timesheets language by providing dynamic client-side modifications, temporal linking and reuse of temporal constructs in fine granularity. All these contributions are demonstrated in the context of a Web-based annotation and extension tool for multimedia documents.; CNPq (process no. 143144/2009-0); FAPESP (process no. 2013/03337-6)

Políticas de gerenciamento de web caches (Web cache management policies)

Rodrigo Machado Oliveira
Source: Biblioteca Digital da Unicamp Publisher: Biblioteca Digital da Unicamp
Type: Master's dissertation Format: application/pdf
Published on 22/12/1999 PT
Search relevance
46.47%
The Web has become one of the main bottlenecks in Internet performance. Recent research shows that the Quality of Service (QoS) parameter that matters most to Web users is response time when retrieving objects. One way to reduce document retrieval latency is to replicate popular documents in repositories physically close to users, a technique known as Web caching. Web cache management policies strongly influence Quality of Service. Eviction policies define which documents should be removed from the cache to free space for a new document to be stored. Admission policies, in turn, determine which documents may be stored in the cache. This dissertation investigates document retrieval time as the key in eviction and admission control policies. Several policies are proposed with the aim of reducing response time when retrieving Web objects. In addition, the times between cache misses are investigated in an attempt to characterize their distribution.; Recent surveys indicate that Web users consider the retrieval time the most important Quality of Service parameter. Web caches have been massively adopted in the Internet in order to reduce the retrieval time of documents as well as to alleviate the ever increasing Internet traffic due to Web traffic. Web cache management policies have great impact on the perceived Quality of Service. Removal policies define which documents should be removed from the cache...
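The abstract names retrieval time as the key for both eviction and admission but does not spell out the proposed policies. The following is only a minimal Python sketch of that general idea, under stated assumptions: the class names, the admission threshold `min_retrieval_time`, and the rule "evict the document that is cheapest to re-fetch" are all illustrative choices, not the dissertation's actual policies.

```python
from dataclasses import dataclass


@dataclass
class CachedDocument:
    url: str
    size: int              # bytes occupied in the cache
    retrieval_time: float  # seconds it took to fetch from the origin server


class RetrievalTimeCache:
    """Toy Web cache keyed on retrieval time (hypothetical policy, for illustration)."""

    def __init__(self, capacity: int, min_retrieval_time: float = 0.0):
        self.capacity = capacity
        self.min_retrieval_time = min_retrieval_time
        self.docs: dict[str, CachedDocument] = {}

    def used(self) -> int:
        return sum(doc.size for doc in self.docs.values())

    def admit(self, doc: CachedDocument) -> bool:
        # Assumed admission policy: store only documents that fit in the cache
        # and were slow enough to fetch that caching them pays off.
        return doc.size <= self.capacity and doc.retrieval_time >= self.min_retrieval_time

    def put(self, doc: CachedDocument) -> bool:
        if not self.admit(doc):
            return False
        self.docs.pop(doc.url, None)  # replace any stale copy of the same URL
        # Assumed eviction policy: remove the documents that are cheapest
        # to re-fetch until the new document fits.
        while self.used() + doc.size > self.capacity:
            victim = min(self.docs.values(), key=lambda d: d.retrieval_time)
            del self.docs[victim.url]
        self.docs[doc.url] = doc
        return True
```

With a capacity of 100 bytes, inserting documents of 60 and 40 bytes and then a 30-byte one forces an eviction, and the victim is whichever cached document had the smallest recorded retrieval time.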

Web metalaboratory = Meta-laboratório na Web

Alessandra da Silva Gomes
Source: Biblioteca Digital da Unicamp Publisher: Biblioteca Digital da Unicamp
Type: Master's dissertation Format: application/pdf
Published on 28/06/2013 PT
Search relevance
46.18%
The scientific data, services and online tools available on the Web offer unprecedented opportunities to conceive new kinds of laboratories by mixing resources. Experimental and collected data can substantiate asynchronous laboratories. Combined with mashup-capable software, they make it possible to produce hybrid laboratories that confront, for example, synthetic simulations with observations. This work explores this opportunity in the context of education through our meta-laboratory, an authoring environment for producing laboratories by combining building blocks encapsulated in components. We introduce laboratory composition patterns and active Web templates as fundamental mechanisms to support the laboratory authoring task. Laboratories can be embedded and mixed in Web documents. This work presents practical experiments in producing virtual and hybrid Web laboratories.; The amount of scientific data, services and on-line tools available on the Web offer an unprecedented opportunity to conceive new kinds of laboratories blending resources. Existing experimental and collected data can substantiate asynchronous laboratories. Combined with mashup enabled software...

Using Neighbors to Date Web Documents

Sérgio Nunes; Cristina Ribeiro; Gabriel David
Source: Universidade do Porto Publisher: Universidade do Porto
Type: Journal article
POR
Search relevance
46.42%
Time has been successfully used as a feature in web information retrieval tasks. In this context, estimating a document's inception date or last update date is a necessary task. Classic approaches have used HTTP header fields to estimate a document's last update time. The main problem with this approach is that it is applicable to only a small part of web documents. In this work, we evaluate an alternative strategy based on a document's neighborhood. Using a random sample containing 10,000 URLs from the Yahoo! Directory, we study each document's links and media assets to determine its age. If we only consider isolated documents, we are able to date 52% of them. Including the document's neighborhood, we are able to estimate the date of more than 85% of the same sample. Also, we find that estimates differ significantly according to the type of neighbors used. The most reliable estimates are based on the document's media assets, while the worst estimates are based on incoming links. These results are experimentally evaluated with a real world application using different datasets.
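The neighbor-based strategy can be sketched roughly as follows. This is an illustration under assumptions, not the authors' method: it reads the standard HTTP `Last-Modified` header, and when the document itself has no usable header it falls back to the median date of its dated neighbors. The function names and the choice of the median as the aggregate are hypothetical; the abstract does not say how neighbor dates are combined.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from statistics import median
from typing import Optional


def parse_last_modified(headers: dict[str, str]) -> Optional[datetime]:
    """Return the Last-Modified header as a datetime, or None if unusable."""
    value = headers.get("Last-Modified")
    if not value:
        return None
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Malformed dates raise TypeError or ValueError depending on the Python version.
        return None


def estimate_date(doc_headers: dict[str, str],
                  neighbor_headers: list[dict[str, str]]) -> Optional[datetime]:
    """Date a document from its own header, else from its neighbors."""
    own = parse_last_modified(doc_headers)
    if own is not None:
        return own
    dates = [d for d in map(parse_last_modified, neighbor_headers) if d is not None]
    if not dates:
        return None
    # Assumed aggregation: the median timestamp of the dated neighbors, in UTC.
    return datetime.fromtimestamp(median(d.timestamp() for d in dates), tz=timezone.utc)
```

A caller could pass the headers of a page's media assets separately from those of its incoming links, since the abstract reports that the two neighbor types differ in reliability.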

Information search in web archives

Costa, Miguel Ângelo Leal da, 1979-
Source: Universidade de Lisboa Publisher: Universidade de Lisboa
Type: Doctoral thesis
Published in 2014 POR
Search relevance
46.19%
Doctoral thesis, Informatics (Informatics Engineering), Universidade de Lisboa, Faculdade de Ciências, 2014; Web archives preserve information that was published on the web or digitized from printed publications. Much of that information is unique and historically valuable. However, users do not have dedicated tools to find the desired information, which hampers the usefulness of web archives. This dissertation investigates solutions towards the advance of web archive information retrieval (WAIR) and contributes to the increase of knowledge about its technology and users. The thesis underlying this work is that the search results can be improved by exploiting temporal information intrinsic to web archives. This temporal information was leveraged from two different angles. First, the long-term persistence of web documents was analyzed and modeled to better estimate their relevance to a query. Second, a temporal-dependent ranking framework that learns and combines ranking models specific for each period was devised. This approach contrasts with a typical single-model approach that ignores the variance of web characteristics over time. The proposed approach was empirically validated through various controlled experiments that demonstrated its superiority over the state-of-the-art in WAIR.

Dynamic “Inline” Images: Context-Sensitive Retrieval and Integration of Images into Web Documents

Kahn, Charles E.
Source: Springer-Verlag Publisher: Springer-Verlag
Type: Journal article
EN
Search relevance
46.54%
Integrating relevant images into web-based information resources adds value for research and education. This work sought to evaluate the feasibility of using “Web 2.0” technologies to dynamically retrieve and integrate pertinent images into a radiology web site. An online radiology reference of 1,178 textual web documents was selected as the set of target documents. The ARRS GoldMiner™ image search engine, which incorporated 176,386 images from 228 peer-reviewed journals, retrieved images on demand and integrated them into the documents. At least one image was retrieved in real-time for display as an “inline” image gallery for 87% of the web documents. Each thumbnail image was linked to the full-size image at its original web site. Review of 20 randomly selected Collaborative Hypertext of Radiology documents found that 69 of 72 displayed images (96%) were relevant to the target document. Users could click on the “More” link to search the image collection more comprehensively and, from there, link to the full text of the article. A gallery of relevant radiology images can be inserted easily into web pages on any web server. Indexing by concepts and keywords allows context-aware image retrieval, and searching by document title and subject metadata yields excellent results. These techniques allow web developers to incorporate easily a context-sensitive image gallery into their documents.

A Web-based Question Answering System

Zhang, Dell; Lee, Wee Sun
Source: MIT - Massachusetts Institute of Technology Publisher: MIT - Massachusetts Institute of Technology
Type: Journal article Format: 125500 bytes; application/pdf
EN_US
Search relevance
46.09%
The Web is apparently an ideal source of answers to a large variety of questions, due to the tremendous amount of information available online. This paper describes a Web-based question answering system LAMP, which is publicly accessible. A particular characteristic of this system is that it only takes advantage of the snippets in the search results returned by a search engine like Google. We think such “snippet-tolerant” property is important for an online question answering system to be practical, because it is time-consuming to download and analyze the original web documents. The performance of LAMP is comparable to the best state-of-the-art question answering systems.; Singapore-MIT Alliance (SMA)

Archivage du Web organisationnel dans une perspective archivistique (Archiving the organizational Web from an archival perspective)

Chebbi, Aïda
Source: Université de Montréal Publisher: Université de Montréal
Type: Electronic Thesis or Dissertation
FR
Search relevance
36.65%
The Web is currently a privileged space of expression and activity for many communities, where communication practices and documentary practices enrich one another. In its visible or invisible dimension, the Web is also a planetary documentary reservoir characterized not only by the abundance of information circulating in it, but also by its diversity, complexity and ephemeral nature. Many current Web archiving projects approach this question from the standpoint of preserving online publications, without considering it from an archival perspective. Only a few Web archiving projects aim to preserve the organizational or governmental Web. The archival value of the Web, in particular of the organizational Web, does not seem to be recognized, despite a sustained effort by some national archives to disseminate policies for archiving the organizational Web. This thesis aims to develop a better understanding of the nature of Web archives and to document current practices for archiving the organizational Web. More specifically, this research seeks to answer the following three questions: (1) What do organizational Web archiving policies generally recommend? (2) What are the main characteristics of Web archives? (3) What organizational Web archiving practices are in place in organizations in Quebec? To answer these questions...

Tailoring dynamic ontology-driven web documents by demonstration

Macías, José A.; Castells, Pablo
Source: Institute of Electrical and Electronics Engineers Publisher: Institute of Electrical and Electronics Engineers
Type: conferenceObject; bookPart
ENG
Search relevance
46.27%
J. A. Macías, and P. Castells, "Tailoring dynamic ontology-driven web documents by demonstration", in Sixth International Conference on Information Visualisation, London, 2002, pp. 535 - 540.; In this paper we present DESK, an authoring tool for the automatic customisation of the front-end of web applications as a result of changes that users perform in dynamically generated HTML pages. Our authoring tool uses domain knowledge and presentation knowledge stored in PEGASUS, an automatic web page generation system for rendering ontology-driven knowledge. DESK automatically detects the differences between original and modified web pages and uses heuristics to infer additional knowledge for modelling the context of each change. DESK uses an explicit user model to identify which user can perform each kind of change to the web presentation.

Evaluating the informative quality of web documents using fuzzy linguistic techniques

Peis, E.; Herrera-Viedma, E.; Anaya, K.; Herrera, J.C.
Source: Third Conference of the European Society for Fuzzy Logic and Technology Publisher: Third Conference of the European Society for Fuzzy Logic and Technology
Type: Journal article
EN
Search relevance
46.3%
Recommender systems evaluate and filter the great amount of information available on the Web to assist people in their search processes. A fuzzy linguistic evaluation method of Web documents is presented to generate recommendations. Given an XML document type (e.g. scientific article), we consider that its components are not equally informative. This is indicated by defining linguistic importance attributes to the more meaningful elements of the XML Schema designed for Web documents. The evaluation method generates linguistic recommendations according to linguistic evaluation judgements provided by different recommenders on meaningful elements.

Information search and similarity based on Web 2.0 and semantic technologies

Fuentes Lorenzo, Damaris
Source: Universidad Carlos III de Madrid Publisher: Universidad Carlos III de Madrid
Type: Doctoral thesis
ENG
Search relevance
36.53%
The World Wide Web provides a huge amount of information described in natural language at the current society’s disposal. Web search engines were born from the necessity of finding a particular piece of that information. Their ease of use and their utility have turned these engines into one of the most used web tools on a daily basis. To make a query, users just have to enter a set of words - keywords - in natural language and the engine answers with a list of ordered resources which contain those words. The order is given by ranking algorithms. These algorithms basically use two types of features: dynamic and static factors. The dynamic factor takes the query into account; that is, those documents which contain the keywords used to describe the query are more relevant for that query. The hyperlink structure among documents is an example of a static factor used by most current algorithms. For example, if most documents link to a particular document, this document may have more relevance than others because it is more popular. Even though there is currently a wide consensus on the good results that the majority of web search engines provide, these tools still suffer from some limitations, basically 1) the loneliness of the searching activity itself; and 2) the simple recovery process...

Relational Views of XML for the Semantic Web

Atre, Shruti
Source: Queen's University Publisher: Queen's University
Type: Thesis Format: 1977319 bytes; application/pdf
EN
Search relevance
46.42%
The Semantic Web is the future of the Internet. It is the extension to the Internet in which information will be given well-defined meaning, enabling not only humans but also machines to find, share and combine information more easily. In the Semantic Web documents are not merely pages containing a set of words that form their content. They also encode the meaning and structure of those words. This enables various information retrieval techniques to be performed on the documents in addition to the ones restricted to keywords. The goal of this research is to explore a method for querying the Semantic Web using relational database theory and source transformation techniques. We take as input, documents annotated with XML mark-up and the information tags that we are interested in. We then extract and populate a relational view on the annotated XML documents using these tags and the implicit relations in the XML documents. We evaluate the feasibility of our system by testing on a variety of input and we also explore the kinds of queries that can be made on the extracted relational view.; Thesis (Master, Computing) -- Queen's University, 2007-09-27 10:56:13.513
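As a rough illustration of extracting a relational view from tagged XML, the sketch below loads one row per record element into an in-memory SQLite table, with one column per tag of interest. It is an assumption-laden toy, not the thesis's system: the function name `relational_view`, the one-table layout, and the use of Python's standard `xml.etree.ElementTree` and `sqlite3` modules are all choices made here for illustration, and the source transformation techniques the thesis mentions are not reproduced.

```python
import sqlite3
import xml.etree.ElementTree as ET


def relational_view(xml_text: str, record_tag: str, field_tags: list[str]) -> sqlite3.Connection:
    """Load one row per <record_tag> element, one TEXT column per field tag."""
    if not field_tags:
        raise ValueError("at least one field tag is required")
    # Tag names are used as SQL identifiers, so accept identifier-like names only.
    for name in [record_tag, *field_tags]:
        if not name.isidentifier():
            raise ValueError(f"unsupported tag name: {name!r}")
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f'"{tag}" TEXT' for tag in field_tags)
    conn.execute(f'CREATE TABLE "{record_tag}" ({columns})')
    placeholders = ", ".join("?" for _ in field_tags)
    root = ET.fromstring(xml_text)
    for record in root.iter(record_tag):
        # findtext returns None when a field is absent, which SQLite stores as NULL.
        row = [record.findtext(f".//{tag}") for tag in field_tags]
        conn.execute(f'INSERT INTO "{record_tag}" VALUES ({placeholders})', row)
    return conn
```

Once the view is populated, ordinary SQL such as `SELECT title FROM book WHERE year > '2000'` can be run against it, which is the kind of query that keyword search alone cannot express.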

Adaptação de conteúdos Web para o ambiente WAP (Adapting Web content to the WAP environment)

Santos, Arlindo
Source: Universidade do Porto, Faculdade de Engenharia Publisher: Universidade do Porto, Faculdade de Engenharia
Type: Master's dissertation
POR
Search relevance
46.34%
One of the Internet's current challenges arises from the fact that WAP technology allows mobile devices to access information available on the Internet that is formatted according to its own specification. However, mobile devices can only access multimedia content designed according to a specification that differs from the one used to deliver content to personal computers. One of the current obstacles to the success of WAP technology is undoubtedly the scarcity of content compared with the amount of information on the Web. It is therefore necessary to create enrichment mechanisms, either by adapting information that already exists on the Web or simply by developing dedicated content. This dissertation presents an application that automatically adapts content available on the World Wide Web to the WAP environment, allowing mobile users to access information that until now was accessible only from personal computers. Developing this application required investigating the methods and techniques associated with converting the Web's document formatting language (HTML) into the standard formatting language for the WAP environment (WML).; One of the current challenges of the Internet derives from the fact that the WAP technology allows the use of mobile devices to access information available on the Internet shaped according to a specific standard. However mobile devices only access multimedia content if it is configured according to a standard which differs from that used for personal computers. And one of the existing obstacles for the success of WAP technology is...

Evaluating the informative quality of web sites by fuzzy computing with words

Peis, E.; Herrera-Viedma, E.; Olvera Lobo, María Dolores; Herrera, J.C.; Hassan-Montero, Yusef
Source: Atlantic Web Intelligence Conference Publisher: Atlantic Web Intelligence Conference
Type: Journal article
ENG
Search relevance
46.43%
In this paper we present a method based on fuzzy computing with words to measure the informative quality of Web sites used to publish information stored in XML documents. This method generates linguistic recommendations on the informative quality of Web sites. This method is made up of both an evaluation scheme to analyze the informative quality of such Web sites and a generation method of linguistic recommendations. The evaluation scheme presents both technical criteria of Web site design and criteria related to the content of information of Web sites. It is oriented to the user because the chosen criteria are user friendly, in such a way that visitors to a Web site can assess them by means of linguistic evaluation judgements. The generation method generates linguistic recommendations of Web sites based on those linguistic evaluation judgements using the LOWA and LWA operators. Then, when a user looks for information on the Web we can help him/her with both recommendations on Web sites which store the retrieved documents and also recommendations on other Web sites which store other documents of interest related to his/her information needs. With this proposal information filtering and evaluation possibilities on the Web are increased.

Web Document Clustering and Ranking using Tf-Idf based Apriori Approach

Roul, R. K.; Devanand, O. R.; Sahay, S. K.
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 21/06/2014
Search relevance
36.51%
The dynamic web has grown exponentially over the past few years, with thousands of documents related to a subject now available to the user. Most web documents are unstructured and unorganized, so users find it difficult to locate relevant documents. A more useful and efficient mechanism combines clustering with ranking: clustering groups similar documents in one place, and ranking is applied to each cluster so that the top documents appear first. Besides the particular clustering algorithm, the term weighting functions applied to the selected features to represent a web document are a main aspect of the clustering task. With this in mind, we propose a new mechanism called Tf-Idf based Apriori for clustering web documents. We then rank the documents in each cluster using Tf-Idf and a similarity factor of the documents based on the user query. This approach helps the user get all relevant documents in one place and restrict the search to a few top documents of their choice. For experimental purposes, we took the Classic3 and Classic4 datasets of Cornell University, which have more than 10,000 documents, and used the gensim toolkit to carry out our work. We compared our approach with the traditional Apriori algorithm and found that our approach gives better results for higher minimum support. Our ranking mechanism also gives a good F-measure of 78%.; Comment: 5 Pages
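The ranking step alone (not the Apriori-based clustering) can be illustrated with a small pure-Python sketch. Several things here are assumptions rather than the paper's method: the paper uses the gensim toolkit, whereas this sketch uses only the standard library; the similarity factor is taken to be cosine similarity; and the query is simply treated as one more document when computing inverse document frequencies.

```python
import math
from collections import Counter


def tfidf_vectors(docs: list[list[str]]) -> list[dict[str, float]]:
    """Weight each term by relative term frequency times log inverse document frequency."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count / len(doc) * math.log(n / df[term])
         for term, count in Counter(doc).items()}
        for doc in docs
    ]


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either vector is all zeros."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def rank(cluster: list[list[str]], query: list[str]) -> list[int]:
    """Return document indices ordered by similarity to the query, best first."""
    # Simplification: the query is appended as an extra document for the IDF counts.
    *doc_vecs, query_vec = tfidf_vectors(cluster + [query])
    scores = [cosine(vec, query_vec) for vec in doc_vecs]
    return sorted(range(len(cluster)), key=lambda i: scores[i], reverse=True)
```

Each document is a list of already tokenized terms; tokenization, stop-word removal and stemming are left out to keep the sketch short.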

Fuzzy clustering of web documents using equivalence relations and fuzzy hierarchical clustering

Kumar, Satendra; Kathuria, Mamta; Gupta, Alok Kumar; Rani, Monika
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 06/06/2014
Search relevance
46.1%
Conventional clustering algorithms have difficulty handling the challenges posed by collections of natural data, which are often vague and uncertain. Fuzzy clustering methods have the potential to manage such situations efficiently. Fuzzy clustering constructs clusters with uncertain boundaries and allows one object to belong to one or more clusters with some membership degree. In this paper, an algorithm and experimental results are presented for fuzzy clustering of web documents using equivalence relations and fuzzy hierarchical clustering.; Comment: 5 pages, Software Engineering (CONSEG), 2012
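The abstract does not give the algorithm itself. A common textbook way to cluster with fuzzy equivalence relations, shown here only as a hedged sketch, is to take a reflexive and symmetric similarity matrix, compute its max-min transitive closure to obtain a fuzzy equivalence relation, and then read off crisp clusters at a chosen threshold (an alpha-cut). Lowering the threshold merges clusters, which yields a hierarchy. The function names are hypothetical and the paper's procedure may differ.

```python
def max_min_closure(r: list[list[float]]) -> list[list[float]]:
    """Transitive closure of a fuzzy similarity matrix under max-min composition."""
    n = len(r)
    while True:
        composed = [
            [max(min(r[i][k], r[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        # Take the union with the current relation and stop once nothing changes.
        nxt = [[max(r[i][j], composed[i][j]) for j in range(n)] for i in range(n)]
        if nxt == r:
            return r
        r = nxt


def alpha_cut_clusters(equiv: list[list[float]], alpha: float) -> list[list[int]]:
    """Group items whose equivalence degree is at least alpha.

    Assumes equiv is reflexive, symmetric and max-min transitive, so the
    alpha-cut is a crisp equivalence relation and the groups do not overlap.
    """
    clusters: list[list[int]] = []
    assigned: set[int] = set()
    for i in range(len(equiv)):
        if i in assigned:
            continue
        members = [j for j in range(len(equiv)) if equiv[i][j] >= alpha]
        clusters.append(members)
        assigned.update(members)
    return clusters
```

In a document setting, the input matrix would hold pairwise document similarities in the range 0 to 1, for example cosine similarities between term vectors.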

Semantic-Sensitive Web Information Retrieval Model for HTML Documents

Bassil, Youssef; Semaan, Paul
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 01/04/2012
Search relevance
36.54%
With the advent of the Internet, a new era of digital information exchange has begun. Currently, the Internet encompasses more than five billion online sites and this number is exponentially increasing every day. Fundamentally, Information Retrieval (IR) is the science and practice of storing documents and retrieving information from within these documents. Mathematically, IR systems are at the core based on a feature vector model coupled with a term weighting scheme that weights terms in a document according to their significance with respect to the context in which they appear. Practically, Vector Space Model (VSM), Term Frequency (TF), and Inverse Document Frequency (IDF) are among other long-established techniques employed in mainstream IR systems. However, present IR models only target generic-type text documents, in that, they do not consider specific formats of files such as HTML web documents. This paper proposes a new semantic-sensitive web information retrieval model for HTML documents. It consists of a vector model called SWVM and a weighting scheme called BTF-IDF, particularly designed to support the indexing and retrieval of HTML web documents. The chief advantage of the proposed model is that it assigns extra weights for terms that appear in certain pre-specified HTML tags that are correlated to the semantics of the document. Additionally...
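The idea of giving extra weight to terms inside selected HTML tags can be sketched as below. This is not the BTF-IDF formula, which the abstract does not state: the tag list and boost values in `TAG_WEIGHTS` are hypothetical, the IDF part is omitted, and the rule "use the largest boost among the currently open tags" is an assumption made for illustration. Only Python's standard `html.parser` module is used.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical boosts; the abstract does not list the tags or weights used.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "b": 1.5}


class WeightedTermCounter(HTMLParser):
    """Accumulate term weights, boosting text inside the tags in TAG_WEIGHTS."""

    def __init__(self):
        super().__init__()
        self.open_tags: list[str] = []
        self.weights: Counter = Counter()

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, tolerating unclosed ones.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        # Assumed rule: a term gets the largest boost among its enclosing tags.
        boost = max((TAG_WEIGHTS.get(t, 1.0) for t in self.open_tags), default=1.0)
        for term in data.lower().split():
            self.weights[term] += boost


def weighted_term_frequencies(html: str) -> Counter:
    """Return a tag-boosted term frequency table for one HTML document."""
    parser = WeightedTermCounter()
    parser.feed(html)
    parser.close()
    return parser.weights
```

Terms are split on whitespace only, so punctuation handling and stemming would still be needed before these weights could feed a vector space model.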

Reconstructing the Boundary of a Web Document

Sweet, James
Source: Rochester Institute of Technology Publisher: Rochester Institute of Technology
Type: Doctoral thesis
EN_US
Search relevance
46.51%
Documents found on the World Wide Web (WWW) may be composed of a single web page, or several web pages that are linked together by a table of contents or some other commonly known document construct. When a document spans multiple web pages, it is often inconvenient to print or download the entire document using available tools. This thesis introduces a concept called the document boundary to facilitate representation and analysis of multi-page web documents, and suggests a two-phase approach towards automated identification of document boundaries. In the first phase, individual pages are examined to determine which links are most likely to represent an intra-document link. This procedure is applied recursively to identify a group of candidate pages which may be part of the same document. In the second phase, the link topology and other features of the identified pages are examined in aggregate for indications of a multi-page document. A test suite of both single- and multi-page web documents was assembled using a mixture of handpicked documents and documents which were gathered by an arbitrary third party. The document boundary detection system was applied to the main page of each document. The document boundary detection system was able to achieve a success rate of 73% when its results were compared to the ground truth documents.

Investigation into the use of the World Wide Web as an interface for distributing electronic documents to and from a remote digital color printing site

Recene, Ronald J.
Source: Rochester Institute of Technology Publisher: Rochester Institute of Technology
Type: Thesis Format: 4461201 bytes; 1881 bytes; application/pdf; text/plain
EN_US
Search relevance
36.47%
The World Wide Web and Internet are the most talked-about and fastest-growing mediums for information and electronic document distribution. Their growth has, and will continue to have, a great impact on all forms of media, due to their potential to reach millions of individuals. This project demonstrates the capabilities of the World Wide Web to perform, not only as a publishing vehicle, but as a means for communication and document distribution to a digital color printing facility. In order to show this, a Web site was built that incorporated the utilities needed for the successful exchange of data, such as links to additional software applications available on the Web, downloadable ICC Color Management profiles of the digital color press, a hypertext job estimate/information form, an uploadable FTP server, and directions on how to use the service and create the appropriate files. The result is a functional Web-based printing facility that eliminates the restrictions associated with geographical boundaries. The test to see if this site functioned properly was the successful implementation of the aforementioned applications and tools to create actual documents. Those documents, when put through the developed workflow, must exhibit the designers' original intent when reproduced on a remote digital press and when compared to their originals reproduced on that same press. The written portion of this thesis documents the procedures and rationale behind the methodology used.; School of Printing. Thesis (M.S.)--Rochester Institute of Technology...