Page 1 of results — 15 digital items found in 0.005 seconds

## Multi-class cancer classification by total principal component regression (TPCR) using microarray gene expression data

Source: Oxford University Press
Publisher: Oxford University Press

Type: Journal article

EN

Search relevance: 25.88%

DNA microarray technology provides a promising approach to the diagnosis and prognosis of tumors on a genome-wide scale by monitoring the expression levels of thousands of genes simultaneously. One problem arising from the use of microarray data is the difficulty of analyzing high-dimensional gene expression data, typically with thousands of variables (genes) and far fewer observations (samples), in which severe collinearity is often observed. This makes it difficult to apply classical statistical methods directly to microarray data. In this paper, total principal component regression (TPCR) was proposed to classify human tumors by extracting the latent variable structure underlying microarray data from the augmented subspace of both independent and dependent variables. One of the salient features of our method is that it takes into account not only the latent variable structure but also the errors in the microarray gene expression profiles (independent variables). The prediction performance of TPCR was evaluated by both leave-one-out and leave-half-out cross-validation using four well-known microarray datasets. The stability and reliability of the classification models were further assessed by re-randomization and permutation studies. A fast kernel algorithm was applied to decrease the computation time dramatically. (MATLAB source code is available upon request.)
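As a rough, hypothetical illustration of the setting this abstract describes (many collinear genes, few samples), the sketch below runs plain principal component regression with leave-one-out cross-validation on synthetic data. It is a baseline in the spirit of TPCR, not the authors' method, and every name, size, and parameter is invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy microarray-like data: n samples, p >> n collinear "genes".
n, p, k = 40, 500, 5
latent = rng.normal(size=(n, k))                 # hidden structure
loadings = rng.normal(size=(k, p))
X = latent @ loadings + 0.1 * rng.normal(size=(n, p))
y = (latent[:, 0] > 0).astype(float)             # binary tumor class

def pcr_fit_predict(X_tr, y_tr, X_te, n_comp):
    """Regress class labels on the top-n_comp principal component scores."""
    mu = X_tr.mean(axis=0)
    Xc = X_tr - mu
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = Xc @ Vt[:n_comp].T                       # training scores
    beta, *_ = np.linalg.lstsq(np.c_[np.ones(len(T)), T], y_tr, rcond=None)
    T_te = (X_te - mu) @ Vt[:n_comp].T
    return np.c_[np.ones(len(T_te)), T_te] @ beta

# Leave-one-out cross-validation, as in the paper's evaluation protocol.
correct = 0
for i in range(n):
    mask = np.arange(n) != i
    pred = pcr_fit_predict(X[mask], y[mask], X[~mask], k)
    correct += int((pred[0] > 0.5) == (y[i] > 0.5))
print(f"LOO accuracy: {correct / n:.2f}")
```

TPCR additionally models errors in the independent variables by working in the augmented subspace of both X and y; the sketch above omits that refinement.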

Permanent link for citations:

## An Intelligent Architecture Based on Field Programmable Gate Arrays Designed to Detect Moving Objects by Using Principal Component Analysis

Source: Molecular Diversity Preservation International (MDPI)
Publisher: Molecular Diversity Preservation International (MDPI)

Type: Journal article

Published on 15/10/2010
EN

Search relevance: 26.01%

This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices, applied to high-rate background segmentation of images. The classically sequential execution of the different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method, and subspace projection of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA and based on the developed PCA core. Detection consists of dynamically thresholding the differences between the input image and its reconstruction in the PCA linear subspace previously learned as a background model. The proposal achieves a high processing rate (up to 120 frames per second) and high-quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices.
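In software terms, the background-model-plus-residual-thresholding step can be sketched as follows. This is a minimal NumPy mock-up of the algorithm's data flow, not the FPGA implementation; frame sizes, component counts, and the threshold are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy background frames: a fixed 16x16 scene with slow illumination
# changes plus small pixel noise.
h = w = 16
bg = rng.uniform(size=(h, w))
frames = np.stack([bg * (1 + 0.05 * rng.normal()) +
                   0.01 * rng.normal(size=(h, w)) for _ in range(50)])
X = frames.reshape(50, -1)

# PCA background model: mean plus the leading principal directions
# (the FPGA pipeline computes the analogous correlation/eigen quantities).
mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
B = Vt[:3]                                   # background subspace

def segment(frame, thresh):
    """Project onto the subspace, reconstruct, threshold the residual."""
    v = frame.ravel() - mu
    recon = B.T @ (B @ v) + mu
    return (np.abs(frame.ravel() - recon) > thresh).reshape(h, w)

# A frame containing a moving object: a bright 4x4 patch.
test_frame = bg.copy()
test_frame[4:8, 4:8] += 1.0
mask = segment(test_frame, thresh=0.5)
print(mask.sum())   # roughly the 16 patch pixels
```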

Permanent link for citations:

## Computing matrix symmetrizers. Part 2: new methods using eigendata and linear means; a comparison

Source: Elsevier
Publisher: Elsevier

Type: info:eu-repo/semantics/acceptedVersion; info:eu-repo/semantics/article

Published on 10/07/2015
EN

Search relevance: 46.02%

Keywords: symmetric matrix factorization; symmetrizer; symmetrizer computation; eigenvalue method; linear equation; principal subspace computation; matrix optimization; numerical algorithm; MATLAB code; Mathematics

Over any field F, every square matrix A can be factored into the product of two symmetric matrices as A = S_1 S_2 with S_i = S_i^T ∈ F^{n×n}, and either factor can be chosen nonsingular, as was discovered by Frobenius in 1910. Frobenius' symmetric matrix factorization has been lying almost dormant for a century. The first successful method for computing matrix symmetrizers, i.e., symmetric matrices S such that SA is symmetric, appeared in 2013 [29, 30], inspired by an iterative linear systems algorithm of Huang and Nong (2010). The resulting iterative algorithm has solved this computational problem over R and C, but at high computational cost. This paper develops and tests another linear equations solver, as well as eigendata-, principal vector- and Schur Normal Form-based algorithms for solving the matrix symmetrizer problem numerically. Four new eigendata-based algorithms use, respectively, SVD-based principal vector chain constructions, Gram-Schmidt orthogonalization techniques, the Arnoldi method, or the Schur Normal Form of A in their formulations. They are helped by Datta's 1973 method that symmetrizes unreduced Hessenberg matrices directly. The eigendata-based methods work well and quickly for generic matrices A and create well-conditioned matrix symmetrizers through eigenvector dyad accumulation. But all of the eigendata-based methods have differing deficiencies with matrices A that have ill-conditioned or complicated eigenstructures with nontrivial Jordan normal forms. Our symmetrizer studies for matrices with ill-conditioned eigensystems lead to two open problems of matrix optimization.; This research was partially supported by the Ministerio de Economía y Competitividad of Spain through the research grant MTM2012-32542.
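The eigendata idea can be illustrated in a few lines: if A = P D P^{-1} has real, simple eigenvalues, then S = P^{-T} P^{-1} (an accumulation of left-eigenvector dyads) is symmetric and makes SA symmetric. A minimal sketch for a generic, well-conditioned eigenbasis; matrices with complex or defective eigenstructure need the more careful constructions the paper develops:

```python
import numpy as np

rng = np.random.default_rng(2)

def eig_symmetrizer(A):
    """Symmetrizer S = P^{-T} P^{-1} from an eigenbasis A = P D P^{-1}.

    Valid for real matrices with real, simple eigenvalues: then
    S = S^T and S A = P^{-T} D P^{-1} is symmetric by construction.
    """
    _, P = np.linalg.eig(A)
    Pinv = np.linalg.inv(P)
    S = Pinv.T @ Pinv
    return S.real

# Generic real-spectrum test matrix: a similarity transform of a diagonal.
n = 5
D = np.diag(rng.uniform(1, 2, size=n))
T = rng.normal(size=(n, n))
A = T @ D @ np.linalg.inv(T)

S = eig_symmetrizer(A)
print(np.allclose(S, S.T), np.allclose(S @ A, (S @ A).T))
```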

Permanent link for citations:

## Robust Orthogonal Complement Principal Component Analysis

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Search relevance: 45.94%

Recently, the robustification of principal component analysis has attracted
considerable attention from statisticians, engineers, and computer scientists.
This work focuses on the type of outliers that are not necessarily apparent in the
original observation space but could affect the principal subspace estimation.
Based on a mathematical formulation of such transformed outliers, a novel
robust orthogonal complement principal component analysis (ROC-PCA) is
proposed. The framework combines the popular sparsity-enforcing and low rank
regularization techniques to deal with row-wise outliers as well as
element-wise outliers. A non-asymptotic oracle inequality guarantees the
performance of ROC-PCA in finite samples. To tackle the computational
challenges, an efficient algorithm is developed on the basis of Stiefel
manifold optimization and iterative thresholding. Furthermore, a batch variant
is proposed to significantly reduce the cost in ultra high dimensions. The
paper also points out a pitfall of a common practice of SVD reduction in robust
PCA. Experiments show the effectiveness and efficiency of ROC-PCA in simulation
studies and real data analysis.
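The kind of outlier the abstract targets, and why the orthogonal complement reveals it, can be mocked up as follows. This is a plain PCA-residual detector on synthetic data, not ROC-PCA itself (no sparsity regularization or Stiefel-manifold optimization), and all dimensions and indices are invented:

```python
import numpy as np

rng = np.random.default_rng(3)

# Inliers lie near a 2-dimensional principal subspace of R^10.
n, d, k = 200, 10, 2
basis = np.linalg.qr(rng.normal(size=(d, k)))[0]
X = rng.normal(size=(n, k)) @ basis.T + 0.05 * rng.normal(size=(n, d))

# Row-wise outliers that are small in norm (hence inconspicuous in the
# observation space) but live in the orthogonal complement.
full = np.linalg.qr(np.concatenate([basis, rng.normal(size=(d, d - k))],
                                   axis=1))[0]
comp = full[:, k:]                               # complement basis
out_idx = [5, 50, 150]
X[out_idx] = 0.8 * rng.normal(size=(len(out_idx), d - k)) @ comp.T

# Fit the subspace by plain PCA, then score each row by the norm of its
# component in the orthogonal complement of the fitted subspace.
Xc = X - X.mean(axis=0)
V = np.linalg.svd(Xc, full_matrices=False)[2][:k].T
resid = Xc - (Xc @ V) @ V.T
scores = np.linalg.norm(resid, axis=1)
flagged = np.argsort(scores)[-3:]
print(sorted(flagged.tolist()))
```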

Permanent link for citations:

## Optimization theory of Hebbian/anti-Hebbian networks for PCA and whitening

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Published on 30/11/2015

Search relevance: 25.88%

In analyzing information streamed by sensory organs, our brains face
challenges similar to those solved in statistical signal processing. This
suggests that biologically plausible implementations of online signal
processing algorithms may model neural computation. Here, we focus on such
workhorses of signal processing as Principal Component Analysis (PCA) and
whitening which maximize information transmission in the presence of noise. We
adopt the similarity matching framework, recently developed for principal
subspace extraction, but modify the existing objective functions by adding a
decorrelating term. From the modified objective functions, we derive online PCA
and whitening algorithms which are implementable by neural networks with local
learning rules, i.e. synaptic weight updates that depend on the activity of
only pre- and postsynaptic neurons. Our theory offers a principled model of
neural computations and makes testable predictions such as the dropout of
underutilized neurons.; Comment: Annual Allerton Conference on Communication, Control, and Computing
(Allerton) 2015
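For intuition about local learning rules, a classic Oja-style subspace rule (a Hebbian feedforward term with a local decay) extracts the principal subspace online. This is a simpler relative of the networks derived in the paper, not the authors' similarity-matching algorithm; dimensions, spectrum, and step size are invented:

```python
import numpy as np

rng = np.random.default_rng(4)

# Data with a dominant 2-dimensional principal subspace in R^8.
d, k = 8, 2
C_half = np.diag([3.0, 2.5, 0.3, 0.3, 0.2, 0.2, 0.1, 0.1])

def sample():
    return C_half @ rng.normal(size=d)

def subspace_error(W, top):
    """Distance between span(W^T) and the true principal subspace."""
    Q = np.linalg.qr(W.T)[0]
    return np.linalg.norm(Q @ Q.T - top @ top.T)

top = np.eye(d)[:, :k]                 # true principal subspace
W = 0.1 * rng.normal(size=(k, d))      # feedforward synaptic weights
err0 = subspace_error(W, top)

# Oja's subspace rule: a Hebbian term y x^T plus a decay that keeps W
# roughly orthonormal; each update uses only pre/postsynaptic activity.
eta = 0.01
for _ in range(5000):
    x = sample()
    y = W @ x
    W += eta * (np.outer(y, x) - np.outer(y, y) @ W)

print(err0, subspace_error(W, top))
```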

Permanent link for citations:

## Relations among Some Low Rank Subspace Recovery Models

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Published on 05/12/2014

Search relevance: 26.1%

Recovering intrinsic low dimensional subspaces from data distributed on them
is a key preprocessing step to many applications. In recent years, there has
been a lot of work that models subspace recovery as low rank minimization
problems. We find that some representative models, such as Robust Principal
Component Analysis (R-PCA), Robust Low Rank Representation (R-LRR), and Robust
Latent Low Rank Representation (R-LatLRR), are actually deeply connected. More
specifically, we discover that once a solution to one of the models is
obtained, we can obtain the solutions to other models in closed-form
formulations. Since R-PCA is the simplest, our discovery makes it the center of
low rank subspace recovery models. Our work has two important implications.
First, R-PCA has a solid theoretical foundation. Under certain conditions, we
can find better solutions to these low rank models with overwhelming
probability, even though these models are non-convex. Second, we can obtain
significantly faster algorithms for these models by solving R-PCA first. The
computation cost can be further cut by applying low complexity randomized
algorithms, e.g., our novel $\ell_{2,1}$ filtering algorithm, to R-PCA.
Experiments verify the advantages of our algorithms over other state-of-the-art
ones that are based on the alternating direction method.; Comment: Submitted to Neural Computation
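The R-PCA problem at the center of these relations (principal component pursuit) can be solved with a short augmented-Lagrangian iteration. A minimal fixed-penalty sketch on invented problem sizes; it is not the paper's $\ell_{2,1}$ filtering algorithm or its randomized accelerations:

```python
import numpy as np

rng = np.random.default_rng(5)

def shrink(M, tau):
    """Entrywise soft-thresholding."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def svt(M, tau):
    """Singular value thresholding."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def rpca(D, lam, mu, iters=1000):
    """min ||L||_* + lam ||S||_1  s.t.  L + S = D, by a basic
    alternating augmented-Lagrangian (ADMM-style) iteration."""
    L = np.zeros_like(D); S = np.zeros_like(D); Y = np.zeros_like(D)
    for _ in range(iters):
        L = svt(D - S + Y / mu, 1.0 / mu)
        S = shrink(D - L + Y / mu, lam / mu)
        Y += mu * (D - L - S)
    return L, S

# Rank-2 matrix plus sparse, large-magnitude corruptions.
m = 60
L0 = rng.normal(size=(m, 2)) @ rng.normal(size=(2, m))
S0 = np.zeros((m, m))
spots = rng.random((m, m)) < 0.05
S0[spots] = rng.uniform(-10, 10, size=spots.sum())
D = L0 + S0

mu = D.size / (4.0 * np.abs(D).sum())          # common heuristic penalty
L, S = rpca(D, lam=1.0 / np.sqrt(m), mu=mu)
print(np.linalg.norm(L - L0) / np.linalg.norm(L0))
```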

Permanent link for citations:

## Nonparametric Partial Importance Sampling for Financial Derivative Pricing

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Search relevance: 25.9%

Importance sampling is a promising variance reduction technique for Monte
Carlo simulation based derivative pricing. Existing importance sampling methods
are based on a parametric choice of the proposal. This article proposes an
algorithm that estimates the optimal proposal nonparametrically using a
multivariate frequency polygon estimator. In contrast to parametric methods,
nonparametric estimation allows for close approximation of the optimal
proposal. Standard nonparametric importance sampling is inefficient for
high-dimensional problems. We solve this issue by applying the procedure to a
low-dimensional subspace, which is identified through principal component
analysis and the concept of the effective dimension. The mean square error
properties of the algorithm are investigated and its asymptotic optimality is
shown. Quasi-Monte Carlo is used to further improve the method. The algorithm is
easy to implement (in particular, it requires no analytical computation) and is
computationally very efficient. We demonstrate through path-dependent
and multi-asset option pricing problems that the algorithm leads to significant
efficiency gains compared to other algorithms in the literature.; Comment: 26 pages, 4 figures
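The variance-reduction mechanism is easiest to see in one dimension. Below, a parametric importance sampler with a mean-shifted Gaussian proposal estimates a tail probability far more efficiently than naive Monte Carlo; this illustrates importance sampling in general, not the paper's nonparametric frequency-polygon proposal:

```python
import numpy as np

rng = np.random.default_rng(6)

# Rare-event toy: estimate p = P(Z > 3) for Z ~ N(0, 1).
# Exact value: 1 - Phi(3) = 1.3499e-3.
n = 100_000

# Naive Monte Carlo: the indicator is almost always zero.
z = rng.normal(size=n)
naive = (z > 3).astype(float)

# Importance sampling with proposal N(3, 1); the weight is the likelihood
# ratio phi(z) / phi(z - 3) = exp(-3 z + 9/2).
z_is = rng.normal(loc=3.0, size=n)
is_est = (z_is > 3) * np.exp(-3.0 * z_is + 4.5)

print(naive.mean(), is_est.mean())     # both near 1.35e-3
print(naive.var(), is_est.var())       # IS variance is far smaller
```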

Permanent link for citations:

## Some Options for L1-Subspace Signal Processing

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Published on 04/09/2013

Search relevance: 26%

We describe ways to define and calculate $L_1$-norm signal subspaces which
are less sensitive to outlying data than $L_2$-calculated subspaces. We focus
on the computation of the $L_1$ maximum-projection principal component of a
data matrix containing N signal samples of dimension D and conclude that the
general problem is formally NP-hard in asymptotically large N, D. We prove,
however, that the case of engineering interest of fixed dimension D and
asymptotically large sample support N is not, and we present an optimal
algorithm of complexity $O(N^D)$. We generalize to multiple
$L_1$-max-projection components and present an explicit optimal $L_1$ subspace
calculation algorithm in the form of matrix nuclear-norm evaluations. We
conclude with illustrations of $L_1$-subspace signal processing in the fields
of data dimensionality reduction and direction-of-arrival estimation.; Comment: In Proceedings Tenth Intern. Symposium on Wireless Communication
Systems (ISWCS '13), Ilmenau, Germany, Aug. 27-30, 2013 (The 2013 ISWCS Best
Paper Award in Physical Layer Comm. and Signal Processing); 5 pages; 3
figures
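The identity behind exact $L_1$ principal-component computation is that the unit-norm maximizer of $\|X^T w\|_1$ is $X b^* / \|X b^*\|_2$ for the best sign vector $b^*$. A brute-force sketch over sign vectors follows (exponential in $N$, fine for tiny examples; the paper's contribution is an $O(N^D)$ algorithm for fixed $D$):

```python
from itertools import product

import numpy as np

rng = np.random.default_rng(7)

def l1_principal_component(X):
    """Exact L1 principal component of a D x N data matrix X.

    Uses max_{||w||_2 = 1} ||X^T w||_1 = max_{b in {-1,1}^N} ||X b||_2,
    with maximizer w* = X b* / ||X b*||_2, via exhaustive sign search.
    """
    D, N = X.shape
    best, w_best = -np.inf, None
    for signs in product([-1.0, 1.0], repeat=N):
        v = X @ np.array(signs)
        nv = np.linalg.norm(v)
        if nv > best:
            best, w_best = nv, v / nv
    return w_best

X = rng.normal(size=(2, 8))
w = l1_principal_component(X)
# Sanity check: the exact L1 objective at w beats the L2 principal component.
w_l2 = np.linalg.svd(X)[0][:, 0]
print(np.abs(X.T @ w).sum() >= np.abs(X.T @ w_l2).sum() - 1e-12)
```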

Permanent link for citations:

## Optimal Algorithms for $L_1$-subspace Signal Processing

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Published on 27/05/2014

Search relevance: 26.12%

We describe ways to define and calculate $L_1$-norm signal subspaces which
are less sensitive to outlying data than $L_2$-calculated subspaces. We start
with the computation of the $L_1$ maximum-projection principal component of a
data matrix containing $N$ signal samples of dimension $D$. We show that while
the general problem is formally NP-hard in asymptotically large $N$, $D$, the
case of engineering interest of fixed dimension $D$ and asymptotically large
sample size $N$ is not. In particular, for the case where the sample size is
less than the fixed dimension ($N < D$)...

Permanent link for citations:

## Fast, Exact Bootstrap Principal Component Analysis for p>1 million

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Search relevance: 36.17%

Many have suggested a bootstrap procedure for estimating the sampling
variability of principal component analysis (PCA) results. However, when the
number of measurements per subject ($p$) is much larger than the number of
subjects ($n$), the challenge of calculating and storing the leading principal
components from each bootstrap sample can be computationally infeasible. To
address this, we outline methods for fast, exact calculation of bootstrap
principal components, eigenvalues, and scores. Our methods leverage the fact
that all bootstrap samples occupy the same $n$-dimensional subspace as the
original sample. As a result, all bootstrap principal components are limited to
the same $n$-dimensional subspace and can be efficiently represented by their
low dimensional coordinates in that subspace. Several uncertainty metrics can
be computed solely based on the bootstrap distribution of these low dimensional
coordinates, without calculating or storing the $p$-dimensional bootstrap
components. Fast bootstrap PCA is applied to a dataset of sleep
electroencephalogram (EEG) recordings ($p=900$, $n=392$), and to a dataset of
brain magnetic resonance images (MRIs) ($p\approx$ 3 million, $n=352$). For the
brain MRI dataset, our method allows for standard errors for the first 3
principal components based on 1000 bootstrap samples to be calculated on a
standard laptop in 47 minutes...
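The low-dimensional trick can be demonstrated directly: after one SVD of the full data, each bootstrap sample's principal components come from an SVD of a small $n \times n$ coordinate matrix, then a single multiplication by the fixed $p \times n$ basis. A sketch with invented sizes (centering omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(8)

# High-dimensional data: p measurements, n subjects, p >> n.
p, n = 2000, 30
Y = rng.normal(size=(p, 2)) @ rng.normal(size=(2, n)) \
    + 0.1 * rng.normal(size=(p, n))

# One-time SVD of the full data; every bootstrap sample lies in the
# n-dimensional column space spanned by U.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
A = np.diag(s) @ Vt                    # n x n low-dimensional coordinates

idx = rng.integers(0, n, size=n)       # one bootstrap resample of subjects

# Fast route: SVD of the small matrix, then map back with U.
Ub = np.linalg.svd(A[:, idx], full_matrices=False)[0]
pc_fast = U @ Ub[:, 0]

# Slow route: SVD of the p-dimensional resampled data, for comparison.
pc_direct = np.linalg.svd(Y[:, idx], full_matrices=False)[0][:, 0]
print(np.allclose(np.abs(pc_fast), np.abs(pc_direct)))  # equal up to sign
```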

Permanent link for citations:

## Monotonicity of quantum relative entropy and recoverability

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Search relevance: 15.8%

The relative entropy is a principal measure of distinguishability in quantum
information theory, with its most important property being that it is
non-increasing with respect to noisy quantum operations. Here, we establish a
remainder term for this inequality that quantifies how well one can recover
from a loss of information by employing a rotated Petz recovery map. The main
approach for proving this refinement is to combine the methods of [Fawzi and
Renner, arXiv:1410.0664] with the notion of a relative typical subspace from
[Bjelakovic and Siegmund-Schultze, arXiv:quant-ph/0307170]. Our paper
constitutes partial progress towards a remainder term which features just the
Petz recovery map (not a rotated Petz map), a conjecture which would have many
consequences in quantum information theory.
A well-known result states that the monotonicity of relative entropy with
respect to quantum operations is equivalent to each of the following
inequalities: strong subadditivity of entropy, concavity of conditional
entropy, joint convexity of relative entropy, and monotonicity of relative
entropy with respect to partial trace. We show that this equivalence holds true
for refinements of all these inequalities in terms of the Petz recovery map. So
either all of these refinements are true or all are false.; Comment: v3: 22 pages...
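Schematically, the refinement discussed above strengthens monotonicity with a fidelity-of-recovery remainder term (a hedged reconstruction of the inequality's shape, not a verbatim quotation; $\mathcal{R}_{\sigma,\mathcal{N}}$ denotes a rotated Petz recovery map and $F$ the quantum fidelity):

```latex
% Monotonicity of relative entropy under a channel N, with remainder:
D(\rho \| \sigma) - D\bigl(\mathcal{N}(\rho) \,\big\|\, \mathcal{N}(\sigma)\bigr)
  \;\geq\; -\log F\bigl(\rho,\, (\mathcal{R}_{\sigma,\mathcal{N}} \circ \mathcal{N})(\rho)\bigr)
```

When the channel loses no distinguishability, the right-hand side forces $F$ to one, i.e. the rotated Petz map recovers $\rho$ from $\mathcal{N}(\rho)$.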

Permanent link for citations:

## Distributed Kernel Principal Component Analysis

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Search relevance: 25.94%

Kernel Principal Component Analysis (KPCA) is a key technique in machine
learning for extracting the nonlinear structure of data and pre-processing it
for downstream learning algorithms. We study the distributed setting in which
there are multiple workers, each holding a set of points, who wish to compute
the principal components of the union of their pointsets. Our main result is a
communication efficient algorithm that takes as input arbitrary data points and
computes a set of global principal components, that give relative-error
approximation for polynomial kernels, or give relative-error approximation with
an arbitrarily small additive error for a wide family of kernels including
Gaussian kernels.
While recent work shows how to do PCA in a distributed setting, the kernel
setting is significantly more challenging. Although the "kernel trick" is
useful for efficient computation, it is unclear how to use it to reduce
communication. The main problem with previous work is that it achieves
communication proportional to the dimension of the data points, which would be
proportional to the dimension of the feature space, or to the number of
examples, both of which could be very large. We instead first select a small
subset of points whose span contains a good approximation (the column subset
selection problem...
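The communication pattern at issue can be sketched for the linear case: each worker sends only an $O(kd)$ summary of its data and a coordinator aggregates. This is the standard distributed (non-kernel) PCA baseline the abstract contrasts with, on invented synthetic data; the kernel setting requires the additional column-subset-selection machinery the paper develops:

```python
import numpy as np

rng = np.random.default_rng(9)

# Global dataset split across 4 workers; intrinsic dimension 3.
d, k, workers = 20, 3, 4
basis = np.linalg.qr(rng.normal(size=(d, k)))[0]
parts = [rng.normal(size=(100, k)) @ basis.T
         + 1e-3 * rng.normal(size=(100, d)) for _ in range(workers)]

# Each worker communicates only its top-k right singular vectors scaled
# by singular values: O(k d) numbers instead of its whole dataset.
messages = []
for Xw in parts:
    _, s, Vt = np.linalg.svd(Xw, full_matrices=False)
    messages.append(np.diag(s[:k]) @ Vt[:k])

# The coordinator stacks the summaries and runs PCA once on the stack.
stack = np.vstack(messages)                  # (workers * k) x d
V = np.linalg.svd(stack, full_matrices=False)[2][:k].T

# The aggregated subspace matches the true principal subspace.
print(np.linalg.norm(basis @ basis.T - V @ V.T))
```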

Permanent link for citations:

## A Hebbian/Anti-Hebbian Neural Network for Linear Subspace Learning: A Derivation from Multidimensional Scaling of Streaming Data

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Published on 02/03/2015

Search relevance: 36.31%

Subjects: Quantitative Biology - Neurons and Cognition; Computer Science - Neural and Evolutionary Computing; Statistics - Machine Learning

Neural network models of early sensory processing typically reduce the
dimensionality of streaming input data. Such networks learn the principal
subspace, in the sense of principal component analysis (PCA), by adjusting
synaptic weights according to activity-dependent learning rules. When derived
from a principled cost function these rules are nonlocal and hence biologically
implausible. At the same time, biologically plausible local rules have been
postulated rather than derived from a principled cost function. Here, to bridge
this gap, we derive a biologically plausible network for subspace learning on
streaming data by minimizing a principled cost function. In a departure from
previous work, where cost was quantified by the representation, or
reconstruction, error, we adopt a multidimensional scaling (MDS) cost function
for streaming data. The resulting algorithm relies only on biologically
plausible Hebbian and anti-Hebbian local learning rules. In a stochastic
setting, synaptic weights converge to a stationary state which projects the
input data onto the principal subspace. If the data are generated by a
nonstationary distribution, the network can track the principal subspace. Thus,
our result makes a step towards an algorithmic theory of neural computation.; Comment: Accepted for publication in Neural Computation
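The strain form of the MDS cost has a classical offline solution via the Gram matrix, which is the quantity the online network is derived to match. A sketch of the offline computation only (not the Hebbian/anti-Hebbian neural algorithm), with invented sizes:

```python
import numpy as np

rng = np.random.default_rng(10)

# n data points in R^10 with intrinsic dimension near 2.
n, d, k = 100, 10, 2
X = np.vstack([rng.normal(size=(2, n)),
               0.01 * rng.normal(size=(d - 2, n))])     # d x n

# The strain MDS cost ||X^T X - Y^T Y||_F^2 over k x n outputs Y is
# minimized by the top-k eigendecomposition of the Gram matrix; the
# optimal Y projects the data onto its principal subspace.
G = X.T @ X
evals, evecs = np.linalg.eigh(G)
order = np.argsort(evals)[::-1][:k]
Y = np.sqrt(np.maximum(evals[order], 0.0))[:, None] * evecs[:, order].T

err = np.linalg.norm(G - Y.T @ Y) / np.linalg.norm(G)
print(err)   # small: G is nearly rank 2
```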

Permanent link for citations:

## Probabilistic Approach to Neural Networks Computation Based on Quantum Probability Model Probabilistic Principal Subspace Analysis Example

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Published on 24/01/2010

Search relevance: 25.8%

In this paper, we introduce elements of a probabilistic model that is suitable
for modeling learning algorithms in a biologically plausible artificial neural
network framework. The model is based on two of the main concepts in quantum
physics: the density matrix and the Born rule. As an example, we show that the
proposed probabilistic interpretation is suitable for modeling on-line learning
algorithms for PSA, which are preferably realized by parallel hardware based on
very simple computational units. The proposed concept (model) can be used to
improve algorithm convergence speed, choose the learning factor, or achieve
robustness to the input signal scale. We also show how the Born rule and the
Hebbian learning rule are connected.

Permanent link for citations:

## Riemannian Geometry of Grassmann Manifolds with a View on Algorithmic Computation

Source: Kluwer Academic Publishers
Publisher: Kluwer Academic Publishers

Type: Journal article

Search relevance: 25.93%

Keywords: algorithms; matrix algebra; nonlinear equations; optimization; problem solving; set theory; theorem proving; Euclidean space; Grassmann manifolds; invariant subspace; Newton method

We give simple formulas for the canonical metric, gradient, Lie derivative, Riemannian connection, parallel translation, geodesics and distance on the Grassmann manifold of p-planes in ℝ^n. In these formulas, p-planes are represented as the column space...
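As a small worked example of these formulas, the geodesic distance between two p-planes is the 2-norm of their principal angles, computable from one small SVD. A sketch assuming orthonormal basis matrices for the planes:

```python
import numpy as np

def grassmann_distance(A, B):
    """Geodesic distance between the column spans of orthonormal A and B
    (p-planes in R^n): the 2-norm of the vector of principal angles."""
    s = np.linalg.svd(A.T @ B, compute_uv=False)
    theta = np.arccos(np.clip(s, -1.0, 1.0))
    return np.linalg.norm(theta)

# Two 2-planes in R^4: identical planes are at distance 0, while fully
# orthogonal planes are at distance sqrt(2) * pi/2.
e = np.eye(4)
A = e[:, :2]
B = e[:, 2:]
print(grassmann_distance(A, A), grassmann_distance(A, B))
```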

Permanent link for citations: