The following is the list of accepted ICASSP 2021 papers, sorted by paper title. You can use your web browser's search feature to find your paper number. Notifications have also been sent to all authors by email. If you have not received your notification by email, please contact us at papers@2021.ieeeicassp.org.
Paper Number | Paper Title |
---|---|
1269 | (W)EARABLE MICROPHONE ARRAY AND ULTRASONIC ECHO LOCALIZATION FOR COARSE INDOOR ENVIRONMENT MAPPING |
2490 | 2D-FRFT BASED FREQUENCY SHIFT-INVARIANT DIGITAL IMAGE ENCRYPTION |
1705 | 3D MULTIZONE SOUNDFIELD REPRODUCTION IN A REVERBERANT ENVIRONMENT USING INTENSITY MATCHING METHOD |
2715 | A Bayesian Inference Approach for Location-Based Micro Motions using Radio Frequency Sensing |
2422 | A BAYESIAN INTERPRETATION OF THE LIGHT GATED RECURRENT UNIT |
4082 | A Better and Faster End-to-End Model for Streaming ASR |
5065 | A BIAS-REDUCING LOSS FUNCTION FOR CT IMAGE DENOISING |
3363 | A CAPSULE NETWORK BASED APPROACH FOR DETECTION OF AUDIO SPOOFING ATTACKS |
2087 | A CAUSAL DEEP LEARNING FRAMEWORK FOR CLASSIFYING PHONEMES IN COCHLEAR IMPLANTS |
3014 | A CHAPTER-WISE UNDERSTANDING SYSTEM FOR TEXT-TO-SPEECH IN CHINESE NOVELS |
4461 | A CLASSIFIER FOR IMPROVING CAUSE AND EFFECT IN SSVEP-BASED BCIS FOR INDIVIDUALS WITH COMPLEX COMMUNICATION DISORDERS |
3835 | A CLOSED-LOOP GAIN-CONTROL FEEDBACK MODEL FOR THE MEDIAL EFFERENT SYSTEM OF THE DESCENDING AUDITORY PATHWAY |
2665 | A CLOSER LOOK AT AUDIO-VISUAL MULTI-PERSON SPEECH RECOGNITION AND ACTIVE SPEAKER SELECTION |
2052 | A CO-INTERACTIVE TRANSFORMER FOR JOINT SLOT FILLING AND INTENT DETECTION |
2403 | A COLOR DOPPLER PROCESSING ENGINE WITH AN ADAPTIVE CLUTTER FILTER FOR PORTABLE ULTRASOUND IMAGING DEVICES |
4778 | A COMPACT JOINT DISTILLATION NETWORK FOR VISUAL FOOD RECOGNITION |
3882 | A COMPARATIVE STUDY OF ACOUSTIC AND LINGUISTIC FEATURES CLASSIFICATION FOR ALZHEIMER’S DISEASE DETECTION |
3630 | A COMPARISON OF CONVOLUTIONAL NEURAL NETWORKS FOR GLOTTAL CLOSURE INSTANT DETECTION FROM RAW SPEECH |
2210 | A Comparison of Discrete Latent Variable Models for Speech Representation Learning |
5143 | A COMPARISON OF METHODS FOR OOV-WORD RECOGNITION ON A NEW PUBLIC DATASET |
4262 | A COMPARISON STUDY ON INFANT-PARENT VOICE DIARIZATION |
3645 | A CONSENSUS EQUILIBRIUM SOLUTION FOR DEEP IMAGE PRIOR POWERED BY RED |
2384 | A CONVEX PENALTY FOR BLOCK-SPARSE SIGNALS WITH UNKNOWN STRUCTURES |
2719 | A CORRENTROPY BASED ALGORITHM FOR ROBUST LOCALIZATION IN WIRELESS NETWORKS |
3389 | A CURATED DATASET OF URBAN SCENES FOR AUDIO-VISUAL SCENE ANALYSIS |
4386 | A DECENTRALIZED VARIANCE-REDUCED METHOD FOR STOCHASTIC OPTIMIZATION OVER DIRECTED GRAPHS |
4113 | A DEEP REINFORCEMENT LEARNING APPROACH TO AUDIO-BASED NAVIGATION IN A MULTI-SPEAKER ENVIRONMENT |
2732 | A DEEP SPATIO-TEMPORAL MODEL FOR EEG-BASED IMAGINED SPEECH RECOGNITION |
2189 | A Diffusion FxLMS Algorithm for Multi-Channel Active Noise Control and Variable Spatial Smoothing |
2106 | A DNN AUTOENCODER FOR AUTOMOTIVE RADAR INTERFERENCE MITIGATION |
3750 | A Dynamical Systems Perspective on Online Bayesian Nonparametric Estimators with Adaptive Hyperparameters |
2039 | A FAST AND EFFICIENT NETWORK FOR SINGLE IMAGE DERAINING |
3483 | A FAST RANDOMIZED ADAPTIVE CP DECOMPOSITION FOR STREAMING TENSORS |
2391 | A FEATURES DECOUPLING METHOD FOR MULTIPLE MANIPULATIONS IDENTIFICATION IN IMAGE OPERATION CHAINS |
3293 | A FLOW-BASED NEURAL NETWORK FOR TIME DOMAIN SPEECH ENHANCEMENT |
2261 | A FRAMEWORK FOR PRUNING DEEP NEURAL NETWORKS USING ENERGY-BASED MODELS |
3155 | A FURTHER STUDY OF UNSUPERVISED PRETRAINING FOR TRANSFORMER BASED SPEECH RECOGNITION |
4036 | A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks |
3799 | A GENERAL NETWORK ARCHITECTURE FOR SOUND EVENT LOCALIZATION AND DETECTION USING TRANSFER LEARNING AND RECURRENT NEURAL NETWORK |
5624 | A GENERALIZED ACCELERATED COMPOSITE GRADIENT METHOD: UNITING NESTEROV'S FAST GRADIENT METHOD AND FISTA |
2965 | A GLOBAL CAYLEY PARAMETRIZATION OF STIEFEL MANIFOLD FOR DIRECT UTILIZATION OF OPTIMIZATION MECHANISMS OVER VECTOR SPACES |
1849 | A GLOBAL-LOCAL ATTENTION FRAMEWORK FOR WEAKLY LABELLED AUDIO TAGGING |
4457 | A Graph Learning Algorithm Based on Gaussian Markov Random Fields and Minimax Concave Penalty |
4058 | A HIERARCHICAL SUBSPACE MODEL FOR LANGUAGE-ATTUNED ACOUSTIC UNIT DISCOVERY |
3242 | A HIGH-FRAME-RATE EYE-TRACKING FRAMEWORK FOR MOBILE DEVICES |
2687 | A HOMOGENEITY-BASED MULTISCALE HYPERSPECTRAL IMAGE REPRESENTATION FOR SPARSE SPECTRAL UNMIXING |
4185 | A HYBRID APPROACH TO CODED COMPRESSED SENSING WHERE COUPLING TAKES PLACE VIA THE OUTER CODE |
1545 | A HYBRID CNN-BILSTM VOICE ACTIVITY DETECTOR |
3449 | A HYBRID FEATURE ENHANCEMENT METHOD FOR GLAND SEGMENTATION IN HISTOPATHOLOGY IMAGES |
5122 | A JOINT CONVOLUTIONAL AND SPATIAL QUAD-DIRECTIONAL LSTM NETWORK FOR PHASE UNWRAPPING |
2907 | A JOINT TRAINING FRAMEWORK OF MULTI-LOOK SEPARATOR AND SPEAKER EMBEDDING EXTRACTOR FOR OVERLAPPED SPEECH |
3679 | A LARGE-DIMENSIONAL ANALYSIS OF SYMMETRIC SNE |
3532 | A Large-Scale Chinese Long-text Extractive Summarization Corpus |
5332 | A Layered Embedding-Based Scheme To Cope With Intra-frame Distortion Drift In IPM-Based HEVC Steganography |
3118 | A LOW-COMPLEXITY ADMM-BASED MASSIVE MIMO DETECTORS VIA DEEP NEURAL NETWORKS |
3000 | A LOW-COMPLEXITY MIMO DUAL FUNCTION RADAR COMMUNICATION SYSTEM VIA ONE-BIT SAMPLING |
4592 | A META-LEARNING FRAMEWORK FOR FEW-SHOT CLASSIFICATION OF REMOTE SENSING SCENE |
1195 | A METHOD FOR DETERMINING PERIODICALLY TIME-VARYING BIAS AND ITS APPLICATIONS IN ACOUSTIC FEEDBACK CANCELLATION |
5625 | A MNEMONIC KALMAN FILTER FOR NON-LINEAR SYSTEMS WITH EXTENSIVE TEMPORAL DEPENDENCIES |
4678 | A MODULATION-DOMAIN LOSS FOR NEURAL-NETWORK-BASED REAL-TIME SPEECH ENHANCEMENT |
4280 | A MULTI-CHANNEL TEMPORAL ATTENTION CONVOLUTIONAL NEURAL NETWORK MODEL FOR ENVIRONMENTAL SOUND CLASSIFICATION |
4970 | A Multi-layer Multi-channel Attentive Network for Gender and Age Recognition |
4443 | A MULTIPLE ACCESS CHANNEL GAME USING LATENCY METRIC |
4138 | A MULTI-VIEW APPROACH TO AUDIO-VISUAL SPEAKER VERIFICATION |
2723 | A NEURAL ACOUSTIC ECHO CANCELLER OPTIMIZED USING AN AUTOMATIC SPEECH RECOGNIZER AND LARGE SCALE SYNTHETIC DATA |
4290 | A NEURAL TEXT-TO-SPEECH MODEL UTILIZING BROADCAST DATA MIXED WITH BACKGROUND MUSIC |
2556 | A NEW AUTOMOTIVE RADAR 4D POINT CLOUDS DETECTOR BY USING DEEP LEARNING |
2568 | A NEW DCASE 2017 RARE SOUND EVENT DETECTION BENCHMARK UNDER EQUAL TRAINING DATA: CRNN WITH MULTI-WIDTH KERNELS |
5597 | A NEW DIFFUSION VARIABLE SPATIAL REGULARIZED QRRLS ALGORITHM |
1558 | A NEW FRAMEWORK BASED ON TRANSFER LEARNING FOR CROSS-DATABASE PNEUMONIA DETECTION |
2025 | A NEW HIGH QUALITY TRAJECTORY TILING BASED HYBRID TTS IN REAL TIME |
2997 | A NEW TUBULAR STRUCTURE TRACKING ALGORITHM BASED ON CURVATURE-PENALIZED PERCEPTUAL GROUPING |
3230 | A NOISE-ROBUST SIGNAL PROCESSING STRATEGY FOR COCHLEAR IMPLANTS USING NEURAL NETWORKS |
3409 | A NOVEL ATTENTION-BASED GATED RECURRENT UNIT AND ITS EFFICACY IN SPEECH EMOTION RECOGNITION |
3861 | A NOVEL BAYESIAN APPROACH FOR THE TWO-DIMENSIONAL HARMONIC RETRIEVAL PROBLEM |
4745 | A NOVEL CONVOLUTIONAL NEURAL NETWORK MODEL TO REMOVE MUSCLE ARTIFACTS FROM EEG |
4260 | A NOVEL END-TO-END SPEECH EMOTION RECOGNITION NETWORK WITH STACKED TRANSFORMER LAYERS |
4922 | A NOVEL NMF-HMM SPEECH ENHANCEMENT ALGORITHM BASED ON POISSON MIXTURE MODEL |
5267 | A NOVEL VIEWPORT-ADAPTIVE MOTION COMPENSATION TECHNIQUE FOR FISHEYE VIDEO |
4020 | A PARALLEL ALGORITHM FOR PHASE RETRIEVAL WITH DICTIONARY LEARNING |
4992 | A PARALLELIZABLE LATTICE RESCORING STRATEGY WITH NEURAL LANGUAGE MODELS |
1230 | A parametric unconstrained binaural beamformer based noise reduction and spatial cue preservation for hearing-assistive devices |
2637 | A PARTIALLY COLLAPSED GIBBS SAMPLER FOR UNSUPERVISED NONNEGATIVE SPARSE SIGNAL RESTORATION |
3146 | A PARTIALLY-RELAXED ROBUST DOA ESTIMATOR UNDER NON-GAUSSIAN LOW-RANK INTERFERENCE AND NOISE |
3653 | A PATIENT-INVARIANT MODEL FOR FREEZING OF GAIT DETECTION AIDED BY WAVELET DECOMPOSITION |
4476 | A PERIODIC FRAME LEARNING APPROACH FOR ACCURATE LANDMARK LOCALIZATION IN M-MODE ECHOCARDIOGRAPHY |
2359 | A PLUG AND PLAY FAST INTERSECTION OVER UNION LOSS FOR BOUNDARY BOX REGRESSION |
3114 | A PLUG-AND-PLAY DEEP IMAGE PRIOR |
1983 | A PROBABILISTIC MODEL FOR SEGMENTATION OF AMBIGUOUS 3D LUNG NODULE |
4673 | A PROGRESSIVE LEARNING APPROACH TO ADAPTIVE NOISE AND SPEECH ESTIMATION FOR SPEECH ENHANCEMENT AND NOISY SPEECH RECOGNITION |
4000 | A QUANTITATIVE ANALYSIS OF THE ROBUSTNESS OF NEURAL NETWORKS FOR TABULAR DATA |
4701 | A QUANTITATIVE METRIC FOR PRIVACY LEAKAGE IN FEDERATED LEARNING |
4071 | A QUATERNION-VALUED VARIATIONAL AUTOENCODER |
2031 | A RANK-CONSTRAINED CLUSTERING ALGORITHM WITH ADAPTIVE EMBEDDING |
4212 | A RANKED SIMILARITY LOSS FUNCTION WITH PAIR WEIGHTING FOR DEEP METRIC LEARNING |
4539 | A real-time speaker diarization system based on spatial spectrum |
4848 | A ReLU Dense Layer to Improve the Performance of Neural Networks |
4949 | A ROBUST AND EFFICIENT MULTI-SCALE SEASONAL-TREND DECOMPOSITION |
4993 | A ROBUST COPULA MODEL FOR RADAR-BASED LANDMINE DETECTION |
3964 | A ROBUST TO NOISE ADVERSARIAL RECURRENT MODEL FOR NON-INTRUSIVE LOAD MONITORING |
3911 | A SAMPLE-EFFICIENT SCHEME FOR CHANNEL RESOURCE ALLOCATION IN NETWORKED ESTIMATION |
4575 | A SCALE INVARIANT MEASURE OF FLATNESS FOR DEEP NETWORK MINIMA |
1753 | A SECURE SEARCHABLE IMAGE RETRIEVAL SCHEME WITH CORRECT RETRIEVAL IDENTITY |
1748 | A SEQUENTIAL CONTRASTIVE LEARNING FRAMEWORK FOR ROBUST DYSARTHRIC SPEECH RECOGNITION |
4415 | A short tutorial on the Weisfeiler-Lehman test and its variants |
3043 | A SIMPLIFIED WIENER BEAMFORMER BASED ON COVARIANCE MATRIX MODELLING |
2821 | A SPARSE CODING APPROACH TO AUTOMATIC DIET MONITORING WITH CONTINUOUS GLUCOSE MONITORS |
2350 | A STAGE MATCH FOR QUERY-BY-EXAMPLE SPOKEN TERM DETECTION BASED ON STRUCTURE INFORMATION OF QUERY |
2606 | A structure-guided and sparse-representation-based 3D seismic inversion method |
1068 | A TECHNIQUE FOR OFDM SYMBOL SLICING |
1324 | A Time-domain Convolutional Recurrent Network for Packet Loss Concealment |
2398 | A Triplet Appearance Parsing Network for Person Re-Identification |
2953 | A TWO-STAGE APPROACH TO DEVICE-ROBUST ACOUSTIC SCENE CLASSIFICATION |
3260 | A Two-Stage Deep Modeling Approach to Articulatory Inversion |
2413 | A TYLER-TYPE ESTIMATOR OF LOCATION AND SCATTER LEVERAGING RIEMANNIAN OPTIMIZATION |
2110 | A UNIFIED APPROACH TO TRANSLATE CLASSICAL BANDIT ALGORITHMS TO STRUCTURED BANDITS |
5005 | A UNIVERSAL BERT-BASED FRONT-END MODEL FOR MANDARIN TEXT-TO-SPEECH SYNTHESIS |
4306 | A WIRELESS REFERENCE ACTIVE NOISE CONTROL HEADPHONE USING COHERENCE BASED SELECTION TECHNIQUE |
2249 | ABSOLUTE 3D POSE ESTIMATION AND LENGTH MEASUREMENT OF SEVERELY DEFORMED FISH FROM MONOCULAR VIDEOS IN LONGLINE FISHING |
3024 | ACCDOA: ACTIVITY-COUPLED CARTESIAN DIRECTION OF ARRIVAL REPRESENTATION FOR SOUND EVENT LOCALIZATION AND DETECTION |
1974 | ACCELERATING AUXILIARY FUNCTION-BASED INDEPENDENT VECTOR ANALYSIS |
1518 | ACCELERATING FRANK-WOLFE WITH WEIGHTED AVERAGE GRADIENTS |
3815 | ACOUSTIC ANALYSIS AND DATASET OF TRANSITIONS BETWEEN COUPLED ROOMS |
2871 | ACOUSTIC AND LINGUISTIC ANALYSES TO ASSESS EARLY-ONSET AND GENETIC ALZHEIMER'S DISEASE |
4070 | Acoustic echo cancellation with the dual-signal transformation LSTM network |
3143 | ACOUSTIC REFLECTORS LOCALIZATION FROM STEREO RECORDINGS USING NEURAL NETWORKS |
3650 | ACOUSTICS BASED INTENT RECOGNITION USING DISCOVERED PHONETIC UNITS FOR LOW RESOURCE LANGUAGES |
4934 | ACOUSTIC-TO-ARTICULATORY INVERSION FOR DYSARTHRIC SPEECH BY USING CROSS-CORPUS ACOUSTIC-ARTICULATORY DATA |
4014 | ACTION STATE UPDATE APPROACH TO DIALOGUE MANAGEMENT |
5414 | Active Estimation from Multimodal Data |
4920 | ACTIVE PRIVACY-UTILITY TRADE-OFF AGAINST A HYPOTHESIS TESTING ADVERSARY |
3051 | ACUTE LYMPHOBLASTIC LEUKEMIA DETECTION BASED ON ADAPTIVE UNSHARPENING AND DEEP LEARNING |
1794 | ADAPTABLE ENSEMBLE DISTILLATION |
1298 | ADAPTABLE MULTI-DOMAIN LANGUAGE MODEL FOR TRANSFORMER ASR |
2362 | ADAPTIVE BI-DIRECTIONAL ATTENTION: EXPLORING MULTI-GRANULARITY REPRESENTATIONS FOR MACHINE READING COMPREHENSION |
3704 | ADAPTIVE CONTENTION WINDOW DESIGN USING DEEP Q-LEARNING |
1069 | ADAPTIVE DUAL TREE STRUCTURE FOR SCREEN CONTENT CODING |
1100 | ADAPTIVE FEATURE WEIGHT LEARNING FOR ROBUST CLUSTERING PROBLEM WITH SPARSE CONSTRAINT |
4089 | ADAPTIVE GOP SIZE DECISION FOR MULTI-PASS VIDEO CODING BASED ON HIDDEN MARKOV MODEL |
4282 | Adaptive importance sampling via auto-regressive generative models and Gaussian processes |
1715 | ADAPTIVE MULTI-DOMAIN LEARNING FOR OUTDOOR 3D HUMAN POSE AND SHAPE ESTIMATION |
4521 | ADAPTIVE QUANTIZATION OF MODEL UPDATES FOR COMMUNICATION-EFFICIENT FEDERATED LEARNING |
1041 | ADAPTIVE REAL-TIME FILTER FOR PARTIALLY-OBSERVED BOOLEAN DYNAMICAL SYSTEMS |
2507 | ADAPTIVE RE-BALANCING NETWORK WITH GATE MECHANISM FOR LONG-TAILED VISUAL QUESTION ANSWERING |
5588 | ADAPTIVE REVERBERATION ABSORPTION USING NON-STATIONARY MASKING COMPONENTS DETECTION FOR INTELLIGIBILITY IMPROVEMENT |
5030 | ADAPTIVE RF FINGERPRINT DECOMPOSITION IN MICRO UAV DETECTION BASED ON MACHINE LEARNING |
3978 | ADAPTIVE SUBSAMPLING OF MULTIDOMAIN SIGNALS WITH GRAPH PRODUCTS |
2317 | ADAPT-THEN-COMBINE FULL WAVEFORM INVERSION FOR DISTRIBUTED SUBSURFACE IMAGING IN SEISMIC NETWORKS |
4216 | ADA-SISE: ADAPTIVE SEMANTIC INPUT SAMPLING FOR EFFICIENT EXPLANATION OF CONVOLUTIONAL NEURAL NETWORKS |
3202 | ADASPEECH 2: ADAPTIVE TEXT TO SPEECH WITH UNTRANSCRIBED DATA |
1240 | ADL-MVDR: ALL DEEP LEARNING MVDR BEAMFORMER FOR TARGET SPEECH SEPARATION |
4533 | ADMM-BASED FAST ALGORITHM FOR ROBUST MULTI-GROUP MULTICAST BEAMFORMING |
3745 | ADMM-BASED ML DECODING: FROM THEORY TO PRACTICE |
3819 | ADVANCES IN MORPHOLOGICAL NEURAL NETWORKS: TRAINING, PRUNING AND ENFORCING SHAPE CONSTRAINTS |
2747 | ADVANCING RNN TRANSDUCER TECHNOLOGY FOR SPEECH RECOGNITION |
1839 | ADVERSARIAL ATTACKS ON AUDIO SOURCE SEPARATION |
2819 | ADVERSARIAL ATTACKS ON COARSE-TO-FINE CLASSIFIERS |
2990 | ADVERSARIAL ATTACKS ON OBJECT DETECTORS WITH LIMITED PERTURBATIONS |
3398 | ADVERSARIAL DEFENSE FOR AUTOMATIC SPEAKER VERIFICATION BY CASCADED SELF-SUPERVISED LEARNING MODELS |
4676 | ADVERSARIAL DEFENSE FOR DEEP SPEAKER RECOGNITION USING HYBRID ADVERSARIAL TRAINING |
2288 | Adversarial Examples Detection beyond Image Space |
4359 | ADVERSARIAL GENERATIVE DISTANCE-BASED CLASSIFIER FOR ROBUST OUT-OF-DOMAIN DETECTION |
4153 | Adversarial Learning via Probabilistic Proximity Analysis |
4520 | ADVERSARIALLY ROBUST CLASSIFICATION BASED ON GLRT |
2417 | AEC IN A NETSHELL: ON TARGET AND TOPOLOGY CHOICES FOR FCRN ACOUSTIC ECHO CANCELLATION |
3241 | AFFINE PROJECTION SUBSPACE TRACKING |
1785 | AGAIN-VC: A ONE-SHOT VOICE CONVERSION USING ACTIVATION GUIDANCE AND ADAPTIVE INSTANCE NORMALIZATION |
1729 | AGENT-ENVIRONMENT NETWORK FOR TEMPORAL ACTION PROPOSAL GENERATION |
2108 | AGE-VOX-CELEB: MULTI-MODAL CORPUS FOR FACIAL AND SPEECH ESTIMATION |
1683 | AGGREGATION ARCHITECTURE AND ALL-TO-ONE NETWORK FOR REAL-TIME SEMANTIC SEGMENTATION |
5163 | AISPEECH-SJTU ACCENT IDENTIFICATION SYSTEM FOR THE ACCENTED ENGLISH SPEECH RECOGNITION CHALLENGE |
4853 | AISPEECH-SJTU ASR system for the Accented English Speech Recognition Challenge |
4351 | ALIGN OR ATTEND? TOWARD MORE EFFICIENT AND ACCURATE SPOKEN WORD DISCOVERY USING SPEECH-TO-IMAGE RETRIEVAL |
3148 | ALIGNING SETS OF TEMPORAL SIGNALS WITH RIEMANNIAN GEOMETRY AND KOOPMAN OPERATOR |
5154 | ALIGNING THE TRAINING AND EVALUATION OF UNSUPERVISED TEXT STYLE TRANSFER |
4407 | All for One and One for All: Improving Music Separation by Bridging Networks |
2066 | ALLOCATING DNN LAYERS COMPUTATION BETWEEN FRONT-END DEVICES AND THE CLOUD SERVER FOR VIDEO BIG DATA PROCESSING |
5580 | All-Pass Filter Design Using Blaschke Interpolation |
4983 | ALTERNATING PROJECTIONS GRIDLESS COVARIANCE-BASED ESTIMATION FOR DOA |
1602 | AMPLITUDE MATCHING: MAJORIZATION-MINIMIZATION ALGORITHM FOR SOUND FIELD CONTROL ONLY WITH AMPLITUDE CONSTRAINT |
2843 | An Actor-Critic Reinforcement Learning Approach to Minimum Age of Information Scheduling in Energy Harvesting Networks |
1613 | AN ADAPTIVE DISCRIMINANT AND SPARSITY FEATURE DESCRIPTOR FOR FINGER VEIN RECOGNITION |
3406 | AN ADAPTIVE MULTI-SCALE AND MULTI-LEVEL FEATURES FUSION NETWORK WITH PERCEPTUAL LOSS FOR CHANGE DETECTION |
2105 | AN ADAPTIVE NON-LINEAR PROCESS FOR UNDER-DETERMINED VIRTUAL MICROPHONE BEAMFORMING |
1288 | AN ADAPTIVE PART-BASED MODEL FOR PERSON RE-IDENTIFICATION |
2427 | AN ADAPTIVE PYRAMID SINGLE-VIEW DEPTH LOOKUP TABLE CODING METHOD |
5239 | An adaptive Regularization Approach to Portfolio Optimization |
4061 | AN ADMM BASED NETWORK FOR HYPERSPECTRAL UNMIXING TASKS |
4886 | AN ASYMPTOTICALLY POINTWISE OPTIMAL PROCEDURE FOR SEQUENTIAL JOINT DETECTION AND ESTIMATION |
4714 | AN ASYNCHRONOUS WFST-BASED DECODER FOR AUTOMATIC SPEECH RECOGNITION |
1322 | AN ATTENTION BASED WAVELET CONVOLUTIONAL MODEL FOR VISUAL SALIENCY DETECTION |
3812 | AN ATTENTION MODEL FOR HYPERNASALITY PREDICTION IN CHILDREN WITH CLEFT PALATE |
2202 | AN ATTENTION-SEQ2SEQ MODEL BASED ON CRNN ENCODING FOR AUTOMATIC LABANOTATION GENERATION FROM MOTION CAPTURE DATA |
3757 | AN EFFECTIVE DEEP EMBEDDING LEARNING METHOD BASED ON DENSE-RESIDUAL NETWORKS FOR SPEAKER VERIFICATION |
2348 | AN EFFICIENT ACTIVE SET ALGORITHM FOR COVARIANCE BASED JOINT DATA AND ACTIVITY DETECTION FOR MASSIVE RANDOM ACCESS WITH MASSIVE MIMO |
2810 | An Efficient Algorithm for Device Detection and Channel Estimation in Asynchronous IoT Systems |
2957 | AN EFFICIENT ALTERNATING DIRECTION METHOD FOR GRAPH LEARNING FROM SMOOTH SIGNALS |
4349 | AN EFFICIENT LINEAR PROGRAMMING ROUNDING-AND-REFINEMENT ALGORITHM FOR LARGE-SCALE NETWORK SLICING PROBLEM |
2820 | AN EFFICIENT PAPER ANTI-COUNTERFEITING METHOD BASED ON MICROSTRUCTURE ORIENTATION ESTIMATION |
3917 | AN EMPIRICAL STUDY OF END-TO-END SIMULTANEOUS SPEECH TRANSLATION DECODING STRATEGIES |
1965 | AN EMPIRICAL STUDY OF VISUAL FEATURES FOR DNN BASED AUDIO-VISUAL SPEECH ENHANCEMENT IN MULTI-TALKER ENVIRONMENTS |
5135 | An Empirical Study on Task-Oriented Dialogue Translation |
3997 | An End-to-End Actor-Critic-Based Neural Coreference Resolution System |
5619 | An End-to-End Dense-InceptionNet for Image Copy-Move Forgery Detection |
3995 | AN END-TO-END NON-INTRUSIVE MODEL FOR SUBJECTIVE AND OBJECTIVE REAL-WORLD SPEECH ASSESSMENT USING A MULTI-TASK FRAMEWORK |
3307 | AN END-TO-END SPEECH ACCENT RECOGNITION METHOD BASED ON HYBRID CTC/ATTENTION TRANSFORMER ASR |
5585 | AN ENHANCED SPATIAL SMOOTHING TECHNIQUE WITH ESPRIT ALGORITHM FOR DIRECTION OF ARRIVAL ESTIMATION IN COHERENT SCENARIOS |
2777 | AN EXTENSION OF SPARSE AUDIO DECLIPPER TO MULTIPLE MEASUREMENT VECTORS |
4445 | An F Test for Polynomial Frequency Modulation |
2325 | AN HRNET-BLSTM MODEL WITH TWO-STAGE TRAINING FOR SINGING MELODY EXTRACTION |
3595 | An Improved Data Driven Dynamic SIRD model for Predictive Monitoring of COVID-19 |
3428 | AN IMPROVED DEEP RELATION NETWORK FOR ACTION RECOGNITION IN STILL IMAGES |
4224 | An Improved Event-Independent Network for Polyphonic Sound Event Localization and Detection |
3272 | AN IMPROVED MEAN TEACHER BASED METHOD FOR LARGE SCALE WEAKLY LABELED SEMI-SUPERVISED SOUND EVENT DETECTION |
5280 | AN INVESTIGATION OF END-TO-END MODELS FOR ROBUST SPEECH RECOGNITION |
3036 | AN INVESTIGATION OF USING HYBRID MODELING UNITS FOR IMPROVING END-TO-END SPEECH RECOGNITION SYSTEM |
4352 | AN ITERATIVE FRAMEWORK FOR SELF-SUPERVISED DEEP SPEAKER REPRESENTATION LEARNING |
1911 | An Optimal Stochastic Compositional Optimization Method with Applications to Meta Learning |
4315 | AN ORDER-OPTIMAL ADAPTIVE TEST PLAN FOR NOISY GROUP TESTING UNDER UNKNOWN NOISE MODELS |
5274 | ANALOG BEAMFORMING WITH ANTENNA SELECTION FOR LARGE-SCALE ANTENNA ARRAYS |
3425 | ANALYSING BIAS IN SPOKEN LANGUAGE ASSESSMENT USING CONCEPT ACTIVATION VECTORS |
5584 | Analysis and Detection of Pathological Voice Using Glottal Source Features |
3922 | ANALYSIS OF THE BUT DIARIZATION SYSTEM FOR VOXCONVERSE CHALLENGE |
5342 | ANALYSIS OF X-VECTORS FOR LOW-RESOURCE SPEECH RECOGNITION |
4294 | ANGLE–OF–ARRIVAL (AOA) FACTORIZATION IN MULTIPATH CHANNELS |
4335 | ANTENNA SELECTION FOR MASSIVE MIMO SYSTEMS BASED ON POMDP FRAMEWORK |
1384 | ANY-TO-ONE SEQUENCE-TO-SEQUENCE VOICE CONVERSION USING SELF-SUPERVISED DISCRETE SPEECH REPRESENTATIONS |
3243 | APPLICATION-LAYER DDOS ATTACKS WITH MULTIPLE EMULATION DICTIONARIES |
3063 | APPLIED METHODS FOR SPARSE SAMPLING OF HEAD-RELATED TRANSFER FUNCTIONS |
1474 | APPROXIMATE WEIGHTED CR CODED MATRIX MULTIPLICATION |
2067 | ARRAYS OF FIRST-ORDER STEERABLE DIFFERENTIAL MICROPHONES |
4647 | ARRHYTHMIA CLASSIFICATION WITH HEARTBEAT-AWARE TRANSFORMER |
3223 | ARTIFICIALLY SYNTHESISING DATA FOR AUDIO CLASSIFICATION AND SEGMENTATION TO IMPROVE SPEECH AND MUSIC DETECTION IN RADIO BROADCAST |
4140 | ASR n-best Fusion Nets |
1435 | ASSESSMENT OF BIPOLAR DISORDER USING HETEROGENEOUS DATA OF SMARTPHONE-BASED DIGITAL PHENOTYPING |
2092 | Assisted Learning: Cooperative AI with Autonomy |
5200 | ASV-SUBTOOLS: OPEN SOURCE TOOLKIT FOR AUTOMATIC SPEAKER VERIFICATION |
4693 | ASYMPTOTIC DISTRIBUTION OF GENERALIZED LIKELIHOOD RATIO TEST UNDER MODEL MISSPECIFICATION WITH APPLICATION TO COOPERATIVE RADAR-COMMUNICATIONS |
1004 | ASYNCHRONOUS ACOUSTIC ECHO CANCELLATION OVER WIRELESS CHANNELS |
5375 | ATTACK ON PRACTICAL SPEAKER VERIFICATION SYSTEM USING UNIVERSAL ADVERSARIAL PERTURBATIONS |
2318 | ATTACKING AND DEFENDING BEHIND A PSYCHOACOUSTICS-BASED CAPTCHA |
3503 | ATTENTION ENHANCED SPATIAL TEMPORAL NEURAL NETWORK FOR HRRP RECOGNITION |
4069 | ATTENTION IS ALL YOU NEED IN SPEECH SEPARATION |
5263 | ATTENTION ON ATTENTION SPARSE DENSE CONVOLUTIONAL NETWORK FOR FINANCIAL SIGNAL PROCESSING |
2919 | ATTENTION-BASED MULTI-ENCODER AUTOMATIC PRONUNCIATION ASSESSMENT |
4366 | ATTENTION-EMBEDDED DECOMPOSED NETWORK WITH UNPAIRED CT IMAGES PRIOR FOR METAL ARTIFACT REDUCTION |
1665 | ATTENTION-GUIDED SECOND-ORDER POOLING CONVOLUTIONAL NETWORKS |
4188 | ATTENTIONLITE: TOWARDS EFFICIENT SELF-ATTENTION MODELS FOR VISION |
1746 | ATTENTIVE SEMANTIC EXPLORING FOR MANIPULATED FACE DETECTION |
3481 | ATTRIBUTE DECOMPOSITION FOR FLOW-BASED DOMAIN MAPPING |
5048 | ATVIO: ATTENTION GUIDED VISUAL-INERTIAL ODOMETRY |
5279 | AUDIO DEQUANTIZATION USING (CO)SPARSE (NON)CONVEX METHODS |
5621 | Audio Replay Spoof Attack Detection by Joint Segment-Based Linear Filter Bank Feature Extraction and Attention-Enhanced DenseNet-BiLSTM Network |
3824 | Audio-Visual Event Recognition through the lens of Adversary |
4160 | AUDIOVISUAL HIGHLIGHT DETECTION IN VIDEOS |
4968 | AUDIO-VISUAL SPEECH ENHANCEMENT METHOD CONDITIONED ON THE LIP MOTION AND SPEAKER-DISCRIMINATIVE EMBEDDINGS |
1217 | AUDIO-VISUAL SPEECH INPAINTING WITH DEEP LEARNING |
4178 | AUDIO-VISUAL SPEECH SEPARATION USING CROSS-MODAL CORRESPONDENCE LOSS |
1948 | AUDITORY FILTERBANKS BENEFIT UNIVERSAL SOUND SOURCE SEPARATION |
4302 | AUGMENTED GAUSSIAN LINEAR MIXTURE MODEL FOR SPECTRAL VARIABILITY IN HYPERSPECTRAL UNMIXING |
5185 | AUGMENTING TRANSFERRED REPRESENTATIONS FOR STOCK CLASSIFICATION |
3724 | Autoencoder for Vibrotactile Signal Compression |
1706 | AutoKWS: Keyword Spotting with Differentiable Architecture Search |
4203 | AUTOMATED MULTI-ORGAN SEGMENTATION IN PET IMAGES USING CASCADED TRAINING OF A 3D U-NET AND CONVOLUTIONAL AUTOENCODER |
5074 | AUTOMATIC AND PERCEPTUAL DISCRIMINATION BETWEEN DYSARTHRIA, APRAXIA OF SPEECH, AND NEUROTYPICAL SPEECH |
4772 | AUTOMATIC DYSARTHRIC SPEECH DETECTION EXPLOITING PAIRWISE DISTANCE-BASED CONVOLUTIONAL NEURAL NETWORKS |
4831 | AUTOMATIC ELICITATION COMPLIANCE FOR SHORT-DURATION SPEECH BASED DEPRESSION DETECTION |
4424 | Automatic Fine-grained Localization of Utility Pole Landmarks on Distributed Acoustic Sensing Traces Based on Bilinear ResNets |
2521 | AUTOMATIC MULTITRACK MIXING WITH A DIFFERENTIABLE MIXING CONSOLE OF NEURAL AUDIO EFFECTS |
2771 | AUTOMATIC ORDER SELECTION IN AUTOREGRESSIVE MODELING WITH APPLICATION IN EEG SLEEP-STAGE CLASSIFICATION |
4015 | AUTOMATIC REGISTRATION AND CLUSTERING OF TIME SERIES |
2270 | AUTOREGRESSIVE FAST MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR JOINT BLIND SOURCE SEPARATION AND DEREVERBERATION |
5594 | AUTO-TUNING SPECTRAL CLUSTERING FOR SPEAKER DIARIZATION USING NORMALIZED MAXIMUM EIGENGAP |
3369 | BACKDOOR ATTACK AGAINST SPEAKER VERIFICATION |
1129 | BAITRADAR: A MULTI-MODEL CLICKBAIT DETECTION ALGORITHM USING DEEP LEARNING |
4656 | BANDWIDTH EXTENSION IS ALL YOU NEED |
1134 | BanRAW: Band-Limited Radar Waveform Design via Phase Retrieval |
1659 | BAYESIAN ESTIMATION OF A TAIL-INDEX WITH MARGINALIZED THRESHOLD |
3981 | Bayesian Massive MIMO Channel Estimation with Parameter Estimation using Low-Resolution ADCs |
3215 | BAYESIAN MULTIPLE CHANGE-POINT DETECTION OF PROPAGATING EVENTS |
1539 | BAYESIAN TRANSFORMER LANGUAGE MODELS FOR SPEECH RECOGNITION |
4189 | BAYES-OPTIMAL METHODS FOR FINDING THE SOURCE OF A CASCADE |
2758 | Beam Focusing for Multi-User MIMO Communications with Dynamic Metasurface Antennas |
5380 | BEAMFORMING FOR BIDIRECTIONAL MIMO FULL DUPLEX UNDER THE JOINT SUM POWER AND PER ANTENNA POWER CONSTRAINTS |
5075 | BENIGN OVERFITTING IN BINARY CLASSIFICATION OF GAUSSIAN MIXTURES |
4050 | BI-APC: BIDIRECTIONAL AUTOREGRESSIVE PREDICTIVE CODING FOR UNSUPERVISED PRE-TRAINING AND ITS APPLICATION TO CHILDREN’S ASR |
1740 | BIDIRECTIONAL FOCUSED SEMANTIC ALIGNMENT ATTENTION NETWORK FOR CROSS-MODAL RETRIEVAL |
4381 | BIFOCAL NEURAL ASR: EXPLOITING KEYWORD SPOTTING FOR INFERENCE OPTIMIZATION |
5153 | BI-LEVEL STYLE AND PROSODY DECOUPLING MODELING FOR PERSONALIZED END-TO-END SPEECH SYNTHESIS |
2544 | BINARY CONTROL AND DIGITAL-TO-ANALOG CONVERSION USING COMPOSITE NUV PRIORS AND ITERATIVE GAUSSIAN MESSAGE PASSING |
4458 | BISHIFT-NET FOR IMAGE INPAINTING |
2023 | Bit Constrained Communication Receivers in Joint Radar Communications Systems |
1624 | BLEND-RES^2NET: BLENDED REPRESENTATION SPACE BY TRANSFORMATION OF RESIDUAL MAPPING WITH RESTRAINED LEARNING FOR TIME SERIES CLASSIFICATION |
2949 | BLIND AMPLITUDE ESTIMATION OF EARLY ROOM REFLECTIONS USING ALTERNATING LEAST SQUARES |
1840 | BLIND AND NEURAL NETWORK-GUIDED CONVOLUTIONAL BEAMFORMER FOR JOINT DENOISING, DEREVERBERATION, AND SOURCE SEPARATION |
2188 | Blind Carbon Copy on Dirty Paper: Seamless Spectrum Underlay via Canonical Correlation Analysis |
5260 | Blind Deinterleaving of Signals in Time Series with Self-attention Based Soft Min-cost Flow Learning |
3324 | BLIND EXTRACTION OF MOVING AUDIO SOURCE IN A CHALLENGING ENVIRONMENT SUPPORTED BY SPEAKER IDENTIFICATION VIA X-VECTORS |
4008 | Blind Extraction of Moving Sources via Independent Component and Vector Analysis: Examples |
4669 | BLIND IMAGE QUALITY EVALUATOR WITH SCALE ROBUSTNESS |
4016 | BLOCK KALMAN FILTER: AN ASYMPTOTIC BLOCK PARTICLE FILTER IN THE LINEAR GAUSSIAN CASE |
2908 | BLSTM-BASED CONFIDENCE ESTIMATION FOR END-TO-END SPEECH RECOGNITION |
1164 | Bluetooth Low Energy and CNN-based Angle of Arrival Localization in Presence of Rayleigh Fading |
1123 | BOOSTING LOW-RESOURCE INTENT DETECTION WITH IN-SCOPE PROTOTYPICAL NETWORKS |
3068 | Branchy-GNN: a Device-Edge Co-Inference Framework for Efficient Point Cloud Processing |
5142 | Bridging Unpaired Facial Photos and Sketches by Line-drawings |
4168 | B-SMALL: A BAYESIAN NEURAL NETWORK APPROACH TO SPARSE MODEL-AGNOSTIC META-LEARNING |
2744 | BW-EDA-EEND: STREAMING END-TO-END NEURAL SPEAKER DIARIZATION FOR A VARIABLE NUMBER OF SPEAKERS |
5617 | BYRDIE: BYZANTINE-RESILIENT DISTRIBUTED COORDINATE DESCENT FOR DECENTRALIZED LEARNING |
3719 | BYTECOVER: COVER SONG IDENTIFICATION VIA MULTI-LOSS TRAINING |
3514 | Byzantine-Resilient Decentralized TD Learning with Linear Function Approximation |
4618 | CAM: Context-Aware Masking for Robust Speaker Verification |
2135 | CAMERA CALIBRATION WITH POSE GUIDANCE |
2639 | CAMP: A TWO-STAGE APPROACH TO MODELLING PROSODY IN CONTEXT |
1687 | CANET: CONTEXT-AWARE LOSS FOR DESCRIPTOR LEARNING |
5096 | Canonical Polyadic Tensor Decomposition with Low-Rank Factor Matrices |
4761 | CAPTURING BANDING IN IMAGES: DATABASE CONSTRUCTION AND OBJECTIVE ASSESSMENT |
3859 | CAPTURING MULTI-RESOLUTION CONTEXT BY DILATED SELF-ATTENTION |
3508 | CAPTURING TEMPORAL DEPENDENCIES THROUGH FUTURE PREDICTION FOR CNN-BASED AUDIO CLASSIFIERS |
2926 | Cascade Attention Fusion for Fine-grained Image Captioning based on Multi-layer LSTM |
5206 | CASCADED ALL-PASS FILTERS WITH RANDOMIZED CENTER FREQUENCIES AND PHASE POLARITY FOR ACOUSTIC AND SPEECH MEASUREMENT AND DATA AUGMENTATION |
3784 | Cascaded encoders for unifying streaming and non-streaming ASR |
2406 | CASCADED MODELS WITH CYCLIC FEEDBACK FOR DIRECT SPEECH TRANSLATION |
4130 | CASCADED TIME + TIME-FREQUENCY UNET FOR SPEECH ENHANCEMENT: JOINTLY ADDRESSING CLIPPING, CODEC DISTORTIONS, AND GAPS |
4525 | CASS-NAT: CTC ALIGNMENT-BASED SINGLE STEP NON-AUTOREGRESSIVE TRANSFORMER FOR SPEECH RECOGNITION |
5484 | CATILOC: CAMERA IMAGE TRANSFORMER FOR INDOOR LOCALIZATION |
3982 | CDPAM: Contrastive learning for perceptual audio similarity |
2794 | Centrality based number of cluster estimation in Graph clustering |
2855 | CGAN-NET: CLASS-GUIDED ASYMMETRIC NON-LOCAL NETWORK FOR REAL-TIME SEMANTIC SEGMENTATION |
1707 | CHANNEL ATTENTION RESIDUAL U-NET FOR RETINAL VESSEL SEGMENTATION |
4852 | CHANNEL-WISE MIX-FUSION DEEP NEURAL NETWORKS FOR ZERO-SHOT LEARNING |
1797 | CHARACTERIZATION OF MEMS MICROPHONE SENSITIVITY AND PHASE DISTRIBUTIONS WITH APPLICATIONS IN ARRAY PROCESSING |
3617 | CHECKING PRNU USABILITY ON MODERN DEVICES |
3605 | CIF-BASED COLLABORATIVE DECODING FOR END-TO-END CONTEXTUAL SPEECH RECOGNITION |
1445 | CLASS AWARE ROBUST TRAINING |
3989 | Class-Conditional Defense GAN Against End-to-End Speech Attacks |
2372 | CLASSIFICATION OF EXPERT-NOVICE LEVEL USING EYE TRACKING AND MOTION DATA VIA CONDITIONAL MULTIMODAL VARIATIONAL AUTOENCODER |
2273 | CLASSIFYING SPEECH INTELLIGIBILITY LEVELS OF CHILDREN IN TWO CONTINUOUS SPEECH STYLES |
4393 | CLASS-IMBALANCED CLASSIFIERS USING ENSEMBLES OF GAUSSIAN PROCESSES AND GAUSSIAN PROCESS LATENT VARIABLE MODELS |
4636 | CLOSE-TALKING RECORDING WITH PLANARLY DISTRIBUTED MICROPHONES |
1166 | CLUSTERING A COLLECTION OF NETWORKS WITH MIXTURES OF L1-SPARSE GRAPHICAL MODELS |
2155 | CMIM: CROSS-MODAL INFORMATION MAXIMIZATION FOR MEDICAL IMAGING |
4002 | CNN-BASED SPOKEN TERM DETECTION AND LOCALIZATION WITHOUT DYNAMIC PROGRAMMING |
3246 | COARSE-TO-CAREFUL: SEEKING SEMANTIC-RELATED KNOWLEDGE FOR OPEN-DOMAIN COMMONSENSE QUESTION ANSWERING |
5292 | CO-ATTENTIONAL TRANSFORMERS FOR STORY-BASED VIDEO UNDERSTANDING |
5485 | CO-CAPSULE NETWORKS BASED KNOWLEDGE TRANSFER FOR CROSS-DOMAIN RECOMMENDATION |
3873 | CODEBOOK DESIGN FOR DUAL-POLARIZED ULTRA-MASSIVE MIMO COMMUNICATIONS AT MILLIMETER WAVE AND TERAHERTZ BANDS |
1621 | Code-Switch Speech Rescoring With Monolingual Data |
4104 | Cognitive Memory Constrained Human Decision Making based on Multi-source Information |
4873 | COLD START REVISITED: A DEEP HYBRID RECOMMENDER WITH COLD-WARM ITEM HARMONIZATION |
1291 | COLLABORATIVE INFERENCE VIA ENSEMBLES ON THE EDGE |
3984 | COLLABORATIVE INTELLIGENCE: CHALLENGES AND OPPORTUNITIES |
1827 | COLLABORATIVE LEARNING TO GENERATE AUDIO-VIDEO JOINTLY |
1292 | COMBINED DIFFERENTIAL BEAMFORMING WITH UNIFORM LINEAR MICROPHONE ARRAYS |
3596 | COMBINING ADAPTIVE FILTERING AND COMPLEX-VALUED DEEP POSTFILTERING FOR ACOUSTIC ECHO CANCELLATION |
1562 | COMBINING DYNAMIC IMAGE AND PREDICTION ENSEMBLE FOR CROSS-DOMAIN FACE ANTI-SPOOFING |
1154 | COMMUNICATION OVER BLOCK FADING CHANNELS - AN ALGORITHMIC PERSPECTIVE ON OPTIMAL TRANSMISSION SCHEMES |
4759 | COMMUNICATION-COST AWARE MICROPHONE SELECTION FOR NEURAL SPEECH ENHANCEMENT WITH AD-HOC MICROPHONE ARRAYS |
1623 | COMPACT GRAPH ARCHITECTURE FOR SPEECH EMOTION RECOGNITION |
2999 | COMPARATIVE STUDY OF DIFFERENT EPOCH EXTRACTION METHODS FOR SPEECH ASSOCIATED WITH VOICE DISORDERS |
3941 | COMPARISON OF DEEP CO-TRAINING AND MEAN-TEACHER APPROACHES FOR SEMI-SUPERVISED AUDIO TAGGING |
5610 | Comparison of Wavelet and RID-Rihaczek Based Methods for Phase-Amplitude Coupling |
2587 | COMPLEX RATIO MASKING FOR SINGING VOICE SEPARATION |
4997 | COMPLEX-VALUED VS. REAL-VALUED NEURAL NETWORKS FOR CLASSIFICATION PERSPECTIVES: AN EXAMPLE ON NON-CIRCULAR DATA |
2674 | COMPOSITIONAL EMBEDDING MODELS FOR SPEAKER IDENTIFICATION AND DIARIZATION WITH SIMULTANEOUS SPEECH FROM 2+ SPEAKERS |
4938 | COMPRESSED REPRESENTATION OF CEPSTRAL COEFFICIENTS VIA RECURRENT NEURAL NETWORKS FOR INFORMED SPEECH ENHANCEMENT |
2129 | COMPRESSING DEEP NEURAL NETWORKS FOR EFFICIENT SPEECH ENHANCEMENT |
1811 | COMPRESSING LOCAL DESCRIPTOR MODELS FOR MOBILE APPLICATIONS |
2424 | COMPRESSIVE SIGNAL RECOVERY UNDER SENSING MATRIX ERRORS COMBINED WITH UNKNOWN MEASUREMENT GAINS |
2985 | COMPRESSIVE WIDEBAND SPECTRUM SENSING AND CARRIER FREQUENCY ESTIMATION WITH UNKNOWN MIMO CHANNELS |
4013 | COMPUTATIONALLY EFFICIENT DNN-BASED APPROXIMATION OF AN AUDITORY MODEL FOR APPLICATIONS IN SPEECH PROCESSING |
3877 | CONFIDENCE ESTIMATION FOR ATTENTION-BASED SEQUENCE-TO-SEQUENCE MODELS FOR SPEECH RECOGNITION |
5626 | Consensus Based Distributed Spectral Radius Estimation |
2946 | Constant approximation algorithm for minimizing concave impurity |
3069 | CONSTRAINED TENSOR DECOMPOSITION FOR 2D DOA ESTIMATION IN TRANSMIT BEAMSPACE MIMO RADAR WITH SUBARRAYS |
1351 | CONSTRUCTION OF A LARGE-SCALE JAPANESE ASR CORPUS ON TV RECORDINGS |
1846 | CONSTRUCTION OF UNIT-NORM TIGHT FRAME BASED PRECONDITIONER FOR SPARSE CODING |
4653 | CONTACT TRACING ENHANCES THE EFFICIENCY OF COVID-19 GROUP TESTING |
4112 | Content-Aware Speaker Embeddings for Speaker Diarisation |
3949 | CONTEXT-AWARE PROSODY CORRECTION FOR TEXT-BASED SPEECH EDITING |
3568 | Context-Aware Speech Stress Detection in Hospital Workers Using Bi-LSTM Classifiers |
2150 | CONTINUOUS CNN FOR NONUNIFORM TIME SERIES |
5149 | CONTINUOUS FACE AGING GENERATIVE ADVERSARIAL NETWORKS |
1894 | CONTINUOUS SPEECH SEPARATION WITH CONFORMER |
3525 | CONTINUOUS-TIME SELF-ATTENTION IN NEURAL DIFFERENTIAL EQUATION |
5171 | CONTRASTIVE EMBEDDING LEARNING METHOD FOR RESPIRATORY SOUND CLASSIFICATION |
2114 | CONTRASTIVE LEARNING OF GENERAL-PURPOSE AUDIO REPRESENTATIONS |
5300 | Contrastive Predictive Coding Supported Factorized Variational Autoencoder for Unsupervised Learning of Disentangled Speech Representations |
2553 | Contrastive Self-supervised Learning for Text-independent Speaker Verification |
5025 | CONTRASTIVE SELF-SUPERVISED LEARNING FOR WIRELESS POWER CONTROL |
4042 | CONTRASTIVE SEMI-SUPERVISED LEARNING FOR ASR |
5179 | CONTRASTIVE SEPARATIVE CODING FOR SELF-SUPERVISED REPRESENTATION LEARNING |
3434 | CONTRASTIVE UNSUPERVISED LEARNING FOR SPEECH EMOTION RECOGNITION |
1514 | CONTROL ARCHITECTURE OF THE DOUBLE-CROSS-CORRELATION PROCESSOR FOR SAMPLING-RATE-OFFSET ESTIMATION IN ACOUSTIC SENSOR NETWORKS |
2116 | CONTROLLED TESTING AND ISOLATION FOR SUPPRESSING COVID-19 |
3490 | CONVERGENCE ANALYSIS OF THE GRAPH-TOPOLOGY-INFERENCE KERNEL LMS ALGORITHM |
3735 | CONVERSATIONAL QUERY REWRITING WITH SELF-SUPERVISED LEARNING |
5131 | CONVEX NEURAL AUTOREGRESSIVE MODELS: TOWARDS TRACTABLE, EXPRESSIVE, AND THEORETICALLY-BACKED MODELS FOR SEQUENTIAL FORECASTING AND GENERATION |
4166 | CONVOLUTIONAL DROPOUT AND WORDPIECE AUGMENTATION FOR END-TO-END SPEECH RECOGNITION |
3298 | CONVOLUTIONAL NEURAL NETWORK-AIDED BIT-FLIPPING FOR BELIEF PROPAGATION DECODING OF POLAR CODES |
3052 | CONVOLUTIVE TRANSFER FUNCTION INVARIANT SDR TRAINING CRITERIA FOR MULTI-CHANNEL REVERBERANT SPEECH SEPARATION |
5591 | COOPERATIVE PARAMETER ESTIMATION ON THE UNIT SPHERE USING A NETWORK OF DIFFUSION PARTICLE FILTERS |
1179 | COOPERATIVE PARAMETER TRACKING ON THE UNIT SPHERE USING DISTRIBUTED ADAPT-THEN-COMBINE PARTICLE FILTERS AND PARALLEL TRANSPORT |
2733 | Cooperative Scenarios For Multi-agent Reinforcement learning In Wireless Edge Caching |
3066 | COOPNET: MULTI-MODAL COOPERATIVE GENDER PREDICTION IN SOCIAL MEDIA USER PROFILING |
4110 | CopyPaste: An Augmentation Method for Speech Emotion Recognition |
2093 | CORRELATION-BASED ROBUST LINEAR REGRESSION WITH ITERATIVE OUTLIER REMOVAL |
5537 | CORRUPTED CONTEXTUAL BANDITS: ONLINE LEARNING WITH CORRUPTED CONTEXT |
2960 | COST AFFINITY LEARNING NETWORK FOR STEREO MATCHING |
2745 | COUGHWATCH: REAL-WORLD COUGH DETECTION USING SMARTWATCHES |
5479 | COUNT AND SEPARATE: INCORPORATING SPEAKER COUNTING FOR CONTINUOUS SPEAKER SEPARATION |
5164 | COUNT SKETCH WITH ZERO CHECKING: EFFICIENT RECOVERY OF HEAVY COMPONENTS |
4685 | crank: an open-source software for nonparallel voice conversion based on vector-quantized variational autoencoder |
2659 | Cross Scene Video Foreground Segmentation via Co-occurrence Probability Oriented Supervised and Unsupervised Model Interaction |
4878 | Cross-Corpus Speech Emotion Recognition Using Joint Distribution Adaptive Regression |
3360 | Cross-Domain Semi-Supervised Deep Metric Learning for Image Sentiment Analysis |
5110 | Cross-Domain Sentiment Classification With Contrastive Learning and Mutual Information Maximization |
1572 | CROSS-MODAL KNOWLEDGE DISTILLATION FOR FINE-GRAINED ONE-SHOT CLASSIFICATION |
1491 | Cross-Modal Representation Reconstruction for Zero-Shot Classification |
1081 | CROSS-MODAL SPECTRUM TRANSFORMATION NETWORK FOR ACOUSTIC SCENE CLASSIFICATION |
5033 | CROSS-SILO FEDERATED TRAINING IN THE CLOUD WITH DIVERSITY SCALING AND SEMI-SUPERVISED LEARNING |
3498 | CROSS-TEAGER ENERGY CEPSTRAL COEFFICIENTS FOR REPLAY SPOOF DETECTION ON VOICE ASSISTANTS |
1759 | Crowd Counting via multi-level regression with Latent Gaussian maps |
4622 | Crowdsourcing approach for subjective evaluation of echo impairment |
3747 | CRYPTO-ORIENTED NEURAL ARCHITECTURE DESIGN |
2648 | CT-CAPS: Feature Extraction-based Automated Framework for COVID-19 Disease Identification from Chest CT Scans using Capsule Networks |
3787 | CUE-PRESERVING MMSE FILTER WITH BAYESIAN SNR MARGINALIZATION FOR BINAURAL SPEECH ENHANCEMENT |
2615 | CYCLE GENERATIVE ADVERSARIAL NETWORK APPROACHES TO PRODUCE NOVEL PORTABLE CHEST X-RAYS IMAGES FOR COVID-19 DIAGNOSIS |
4597 | DAG-GAN: CAUSAL STRUCTURE LEARNING WITH GENERATIVE ADVERSARIAL NETS |
1802 | DATA AUGMENTATION WITH SIGNAL COMPANDING FOR DETECTION OF LOGICAL ACCESS ATTACKS |
2535 | DATA DISCOVERY USING LOSSLESS COMPRESSION-BASED SPARSE REPRESENTATION |
3828 | DATA FUSION FOR AUDIOVISUAL SPEAKER LOCALIZATION: EXTENDING DYNAMIC STREAM WEIGHTS TO THE SPATIAL DOMAIN |
1297 | Data-Driven Adaptive Network Resource Slicing for Multi-Tenant Networks |
3624 | DATA-EFFICIENT FRAMEWORK FOR REAL-WORLD MULTIPLE SOUND SOURCE 2D LOCALIZATION |
3123 | DBNET: DOA-DRIVEN BEAMFORMING NETWORK FOR END-TO-END REVERBERANT SOUND SOURCE SEPARATION |
1345 | DCASENET: AN INTEGRATED PRETRAINED DEEP NEURAL NETWORK FOR DETECTING AND CLASSIFYING ACOUSTIC SCENES AND EVENTS |
3722 | DEAAN: DISENTANGLED EMBEDDING AND ADVERSARIAL ADAPTATION NETWORK FOR ROBUST SPEAKER REPRESENTATION LEARNING |
2986 | Decentralized Deep Learning using Momentum-Accelerated Consensus |
4613 | Decentralized motion inference and registration of Neuropixel data |
4252 | DECENTRALIZED OPTIMIZATION ON TIME-VARYING DIRECTED GRAPHS UNDER COMMUNICATION CONSTRAINTS |
4654 | DECENTRALIZED OPTIMIZATION OVER NOISY, RATE-CONSTRAINED NETWORKS: HOW WE AGREE BY TALKING ABOUT HOW WE DISAGREE |
1396 | Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition |
3204 | DECISION TREE BASED INTER PARTITION TERMINATION FOR AV1 ENCODING |
4548 | DECODING MUSIC ATTENTION FROM "EEG HEADPHONES": A USER-FRIENDLY AUDITORY BRAIN-COMPUTER INTERFACE |
2001 | Decoding neural representations of rhythmic sounds from magnetoencephalography |
3602 | DECOMPOSING TEXTURES USING EXPONENTIAL ANALYSIS |
2185 | Decouple the High-Frequency and Low-Frequency Information of Images for Semantic Segmentation |
4894 | DECOUPLING PRONUNCIATION AND LANGUAGE FOR END-TO-END CODE-SWITCHING AUTOMATIC SPEECH RECOGNITION |
1397 | DEEP ACTIVE LEARNING APPROACH TO ADAPTIVE BEAMFORMING FOR MMWAVE INITIAL ALIGNMENT |
2224 | DEEP ADVERSARIAL QUANTIZATION NETWORK FOR CROSS-MODAL RETRIEVAL |
5620 | Deep and Ordinal Ensemble Learning for Human Age Estimation From Facial Images |
1931 | DEEP AUTO-ENCODING AND BIOHASHING FOR SECURE FINGER VEIN RECOGNITION |
3282 | DEEP COLOR CONSTANCY USING TEMPORAL GRADIENT UNDER AC LIGHT SOURCES |
5039 | DEEP CONVOLUTIONAL AND RECURRENT NETWORKS FOR POLYPHONIC INSTRUMENT CLASSIFICATION FROM MONOPHONIC RAW AUDIO WAVEFORMS |
4120 | Deep Convolutional Gaussian Processes for mmWave Outdoor Localization |
3749 | DEEP DETERMINISTIC INFORMATION BOTTLENECK WITH MATRIX-BASED ENTROPY FUNCTIONAL |
3693 | DEEP ENSEMBLE SIAMESE NETWORK FOR INCREMENTAL SIGNAL CLASSIFICATION |
1097 | DEEP GENERATIVE DEMIXING: ERROR BOUNDS FOR DEMIXING SUBGAUSSIAN MIXTURES OF LIPSCHITZ SIGNALS |
2142 | DEEP GENERATIVE MODEL LEARNING FOR BLIND SPECTRUM CARTOGRAPHY WITH NMF-BASED RADIO MAP DISAGGREGATION |
4460 | DEEP HASHING FOR MOTION CAPTURE DATA RETRIEVAL |
4697 | DEEP LEARNING ARCHITECTURAL DESIGNS FOR SUPER-RESOLUTION OF NOISY IMAGES |
4847 | DEEP LEARNING BASED HYBRID PRECODING IN DUAL-BAND COMMUNICATION SYSTEMS |
1841 | Deep Learning for Linear Inverse Problems Using the Plug-and-Play Priors Framework |
5035 | DEEP LEARNING-BASED CROSS-LAYER RESOURCE ALLOCATION FOR WIRED COMMUNICATION SYSTEMS |
3368 | DEEP LUNG AUSCULTATION USING ACOUSTIC BIOMARKERS FOR ABNORMAL RESPIRATORY SOUND EVENT DETECTION |
5066 | DEEP MULTI-FRAME MVDR FILTERING FOR SINGLE-MICROPHONE SPEECH ENHANCEMENT |
4856 | Deep Multiway Canonical Correlation Analysis for Multi-subject EEG Normalization |
2712 | Deep Neural Network based Cough Detection using Bed-mounted Accelerometer Measurements |
3505 | DEEP NEURAL NETWORK EMBEDDINGS FOR THE ESTIMATION OF THE DEGREE OF SLEEPINESS |
4526 | DEEP NEURAL NETWORKS WITH FLEXIBLE COMPLEXITY WHILE TRAINING BASED ON NEURAL ORDINARY DIFFERENTIAL EQUATIONS |
2394 | DEEP RESIDUAL ECHO SUPPRESSION WITH A TUNABLE TRADEOFF BETWEEN SIGNAL DISTORTION AND ECHO SUPPRESSION |
1551 | DEEP S3PR: SIMULTANEOUS SOURCE SEPARATION AND PHASE RETRIEVAL USING DEEP GENERATIVE MODELS |
3717 | DEEP SEMI-SUPERVISED METRIC LEARNING VIA IDENTIFICATION OF MANIFOLD MEMBERSHIPS |
2770 | DEEP TRANSFORM AND METRIC LEARNING NETWORKS |
5011 | Deep Unfolding Network for Block-Sparse Signal Recovery |
1795 | DEEP WEIGHTED MMSE DOWNLINK BEAMFORMING |
4179 | DeepEmoCluster: A Semi-Supervised Framework for Latent Cluster Representation of Speech Emotions |
2201 | DEEPF0: END-TO-END FUNDAMENTAL FREQUENCY ESTIMATION FOR MUSIC AND SPEECH SIGNALS |
2401 | DeepNodule: Multi-task Learning of Segmentation Bootstrap for Pulmonary Nodule Detection |
4229 | DEEPTALK: VOCAL STYLE ENCODING FOR SPEAKER RECOGNITION AND SPEECH SYNTHESIS |
5195 | DEFICIENT BASIS ESTIMATION OF NOISE SPATIAL COVARIANCE MATRIX FOR RANK-CONSTRAINED SPATIAL COVARIANCE MATRIX ESTIMATION METHOD IN BLIND SPEECH EXTRACTION |
2327 | DEMYSTIFYING MODEL AVERAGING FOR COMMUNICATION-EFFICIENT FEDERATED MATRIX FACTORIZATION |
1895 | DENOISPEECH: DENOISING TEXT TO SPEECH WITH FRAME-LEVEL NOISE MODELING |
3631 | DENSE ATTENTION MODULE FOR ACCURATE PULMONARY NODULE DETECTION |
1399 | Dense Feature Pyramid Grids Network for Single Image Deraining |
5448 | DENSELY CONNECTED MULTI-STAGE MODEL WITH CHANNEL WISE SUBBAND FEATURE FOR REAL-TIME SPEECH ENHANCEMENT |
3547 | DEPENDENCE-GUIDED MULTI-VIEW CLUSTERING |
2463 | DEPRESSION DETECTION BY ANALYSING EYE MOVEMENTS ON EMOTIONAL IMAGES |
5384 | DESIGN OF GRAPH SIGNAL SAMPLING MATRICES FOR ARBITRARY SIGNAL SUBSPACES |
5575 | DESIGNING RANDOM FM RADAR WAVEFORMS WITH COMPACT SPECTRUM |
3788 | DETECTING ACOUSTIC REFLECTORS USING A ROBOT’S EGO-NOISE |
5382 | DETECTING ADVERSARIAL ATTACKS ON AUDIOVISUAL SPEECH RECOGNITION |
3416 | DETECTING ALZHEIMER'S DISEASE FROM SPEECH USING NEURAL NETWORKS WITH BOTTLENECK FEATURES AND DATA AUGMENTATION |
5389 | DETECTING SIGNAL CORRUPTIONS IN VOICE RECORDINGS FOR SPEECH THERAPY |
2772 | DETECTION OF AUDIO-VIDEO SYNCHRONIZATION ERRORS VIA EVENT DETECTION |
2828 | DETECTION OF COVID-19 THROUGH THE ANALYSIS OF VOCAL FOLD OSCILLATIONS |
4167 | DETECTION OF MALICIOUS DNS AND WEB SERVERS USING GRAPH-BASED APPROACHES |
2748 | DETECTION OF POST-TRAUMATIC STRESS DISORDER USING LEARNED TIME-FREQUENCY REPRESENTATIONS FROM PUPILLOMETRY |
1892 | DEVELOPING REAL-TIME STREAMING TRANSFORMER TRANSDUCER FOR SPEECH RECOGNITION ON LARGE-SCALE DATASET |
3488 | DEVELOPMENT OF THE CUHK ELDERLY SPEECH RECOGNITION SYSTEM FOR NEUROCOGNITIVE DISORDER DETECTION USING THE DEMENTIABANK CORPUS |
1943 | DFDM: A DEEP FEATURE DECOUPLING MODULE FOR LUNG NODULE SEGMENTATION |
3985 | DHASP: DIFFERENTIABLE HEARING AID SPEECH PROCESSING |
3510 | DHCN: DEEP HIERARCHICAL CONTEXT NETWORKS FOR IMAGE ANNOTATION |
3354 | DIDISPEECH: A LARGE SCALE MANDARIN SPEECH CORPUS |
2096 | Differentiable Signal Processing With Black-Box Audio Effects |
2352 | DIFFERENTIAL CHAOS SHIFT KEYING-BASED WIRELESS POWER TRANSFER |
1834 | Differential Convolution Feature Guided Deep Multi-scale Multiple Instance Learning for Aerial Scene Classification |
4011 | Dimension Selected Subspace Clustering |
3959 | DIRECTION OF ARRIVAL ESTIMATION FOR NON-COHERENT SUB-ARRAYS VIA JOINT SPARSE AND LOW-RANK SIGNAL RECOVERY |
2547 | DIRECTION PRESERVING WIND NOISE REDUCTION OF B-FORMAT SIGNALS |
4060 | DIRECTIONAL ASR: A NEW PARADIGM FOR E2E MULTI-SPEAKER SPEECH RECOGNITION WITH SOURCE LOCALIZATION |
1410 | DIRECTIONAL SPARSE FILTERING USING WEIGHTED LEHMER MEAN FOR BLIND SEPARATION OF UNBALANCED SPEECH MIXTURES |
3916 | DISCRETE COSINE TRANSFORM BASED CAUSAL CONVOLUTIONAL NEURAL NETWORK FOR DRIFT COMPENSATION IN CHEMICAL SENSORS |
3352 | DISCRIMINABILITY OF SINGLE-LAYER GRAPH NEURAL NETWORKS |
5256 | DISENTANGLED SPEAKER AND LANGUAGE REPRESENTATIONS USING MUTUAL INFORMATION MINIMIZATION AND DOMAIN ADAPTATION FOR CROSS-LINGUAL TTS |
2265 | DISENTANGLEMENT FOR AUDIO-VISUAL EMOTION RECOGNITION USING MULTITASK SETUP |
1425 | DISENTANGLING SUBJECT-DEPENDENT/-INDEPENDENT REPRESENTATIONS FOR 2D MOTION RETARGETING |
2727 | DISTRIBUTED SCHEDULING USING GRAPH NEURAL NETWORKS |
4132 | DISTRIBUTED SPEECH SEPARATION IN SPATIALLY UNCONSTRAINED MICROPHONE ARRAYS |
2225 | DISTRIBUTION-AWARE HIERARCHICAL WEIGHTING METHOD FOR DEEP METRIC LEARNING |
5329 | DIVIDE AND CONQUER: ONE-BIT MIMO-OFDM DETECTION BY INEXACT EXPECTATION MAXIMIZATION |
1398 | DNANet: Dense Nested Attention Network for Single Image Dehazing |
4095 | DNSMOS: A NON-INTRUSIVE PERCEPTUAL OBJECTIVE SPEECH QUALITY METRIC TO EVALUATE NOISE SUPPRESSORS |
4267 | DO AS I MEAN, NOT AS I SAY: SEQUENCE LOSS TRAINING FOR SPOKEN LANGUAGE UNDERSTANDING |
2603 | DOA ESTIMATION OF A HIDDEN RF SOURCE EXPLOITING SIMPLE BACKSCATTER RADIO TAGS |
1128 | Domain Adaptation for Learning Generator from Paired Few-Shot Data |
4344 | DOMAIN-ADVERSARIAL AUTOENCODER WITH ATTENTION BASED FEATURE LEVEL FUSION FOR SPEECH EMOTION RECOGNITION |
1394 | DOMAIN-AWARE NEURAL LANGUAGE MODELS FOR SPEECH RECOGNITION |
1255 | DOMESTIC ACTIVITIES CLUSTERING FROM AUDIO RECORDINGS USING CONVOLUTIONAL CAPSULE AUTOENCODER NETWORK |
5014 | DON’T LOOK BACK: AN ONLINE BEAT TRACKING METHOD USING RNN AND ENHANCED PARTICLE FILTERING |
1903 | DON’T SHOOT BUTTERFLY WITH RIFLES: MULTI-CHANNEL CONTINUOUS SPEECH SEPARATION WITH EARLY EXIT TRANSFORMER |
1837 | Double Multi-Head Attention for Speaker Verification |
2618 | Double-DCCCAE: Estimation of Body Gestures from Speech Waveform |
5524 | DOUBLE-LINEAR THOMPSON SAMPLING FOR CONTEXT-ATTENTIVE BANDITS |
5360 | DP-SIGNSGD: WHEN EFFICIENCY MEETS PRIVACY AND ROBUSTNESS |
1358 | DP-VTON: TOWARD DETAIL-PRESERVING IMAGE-BASED VIRTUAL TRY-ON NETWORK |
1743 | DrawGAN: Text to Image Synthesis with Drawing Generative Adversarial Networks |
1487 | Drawing Order Recovery from Trajectory Components |
3658 | DUAL METRIC DISCRIMINATOR FOR OPEN SET VIDEO DOMAIN ADAPTATION |
3542 | DUALFORMER: A UNIFIED BIDIRECTIONAL SEQUENCE-TO-SEQUENCE LEARNING |
5107 | Dual-Path Modeling for Long Recording Speech Separation in Meetings |
1411 | DUAL-STREAM NETWORK BASED ON GLOBAL GUIDANCE FOR SALIENT OBJECT DETECTION |
4953 | DURAS: Deep Unfolded Radar Sensing Using Doppler Focusing |
2198 | D-VDAMP: DENOISING-BASED APPROXIMATE MESSAGE PASSING FOR COMPRESSIVE MRI |
3795 | DYNAMIC CURRICULUM LEARNING VIA DATA PARAMETERS FOR NOISE ROBUST KEYWORD SPOTTING |
1722 | Dynamic Graph Learning based on Graph Laplacian |
4787 | DYNAMIC GRAPH MODELING OF SIMULTANEOUS EEG AND EYE-TRACKING DATA FOR READING TASK IDENTIFICATION |
5118 | DYNAMIC POINT CLOUD COMPRESSION USING A CUBOID ORIENTED DISCRETE COSINE BASED MOTION MODEL |
3870 | DYNAMIC RESOURCE OPTIMIZATION FOR ADAPTIVE FEDERATED LEARNING AT THE WIRELESS NETWORK EDGE |
2788 | Dynamic Sparsity Neural Networks for Automatic Speech Recognition |
1131 | DYNAMIC TEXTURE RECOGNITION VIA NUCLEAR DISTANCES ON KERNELIZED SCATTERING HISTOGRAM SPACES |
3067 | EADNET: EFFICIENT ASYMMETRIC DILATED NETWORK FOR SEMANTIC SEGMENTATION |
4781 | EAT: ENHANCED ASR-TTS FOR SELF-SUPERVISED SPEECH RECOGNITION |
2170 | ECCL: EXPLICIT CORRELATION-BASED CONVOLUTION BOUNDARY LOCATOR FOR MOMENT LOCALIZATION |
2174 | ECG HEART-BEAT CLASSIFICATION USING MULTIMODAL IMAGE FUSION |
4561 | ECHO STATE SPEECH RECOGNITION |
5373 | EDGE-AWARE MULTI-SCALE PROGRESSIVE COLORIZATION |
2878 | EEG-BASED EMOTION CLASSIFICATION USING GRAPH SIGNAL PROCESSING |
2583 | Effect of Language Proficiency on Subjective Evaluation of Noise Suppression Algorithms |
5450 | EFFECT OF NOISE AND MODEL COMPLEXITY ON DETECTION OF AMYOTROPHIC LATERAL SCLEROSIS AND PARKINSON’S DISEASE USING PITCH AND MFCC |
4272 | EFFECT OF VIDEO PIXEL-BINNING ON SOURCE ATTRIBUTION OF MIXED MEDIA |
5161 | EFFECTIVE RANK-BASED ESTIMATION OF THE COHERENT-TO-DIFFUSE POWER RATIO |
4889 | Efficient Adversarial Audio Synthesis via Progressive Upsampling |
3512 | EFFICIENT CLIENT CONTRIBUTION EVALUATION FOR HORIZONTAL FEDERATED LEARNING |
4144 | EFFICIENT END-TO-END AUDIO EMBEDDINGS GENERATION FOR AUDIO CLASSIFICATION ON TARGET APPLICATIONS |
2909 | EFFICIENT FACE MANIPULATION VIA DEEP FEATURE DISENTANGLEMENT AND REINTEGRATION NET |
4182 | EFFICIENT KNOWLEDGE DISTILLATION FOR RNN-TRANSDUCER MODELS |
1226 | EFFICIENT LONG PERIODIC BINARY SEQUENCE DESIGNS FOR AUTOMOTIVE RADAR |
4535 | EFFICIENT MIGRATION TO THE NEXT GENERATION OF NETWORKS BASED ON DIGITAL ANNEALING |
3079 | EFFICIENT MULTI-OBJECTIVE GANS FOR IMAGE RESTORATION |
4107 | EFFICIENT NETWORK PROTECTION GAMES AGAINST MULTIPLE TYPES OF STRATEGIC ATTACKERS |
2780 | EFFICIENT POWER ALLOCATION USING GRAPH NEURAL NETWORKS AND DEEP ALGORITHM UNFOLDING |
1671 | EFFICIENT REAL-TIME VIDEO STABILIZATION WITH A NOVEL LEAST SQUARES FORMULATION |
1431 | EFFICIENT SPEECH EMOTION RECOGNITION USING MULTI-SCALE CNN AND ATTENTION |
1958 | EFFICIENT TRAINING DATA GENERATION FOR PHASE-BASED DOA ESTIMATION |
3134 | Efficient Use of End-to-end Data in Spoken Language Processing |
3666 | EGO-BASED ENTROPY MEASURES FOR STRUCTURAL REPRESENTATIONS ON GRAPHS |
4094 | EGO-GNNS: EXPLOITING EGO STRUCTURES IN GRAPH NEURAL NETWORKS |
5616 | EIGENVECTORS OF ORDINARY, GENERALIZED, CENTERED AND OFFSET DISCRETE FOURIER TRANSFORMS BASED ON LOOKUP TABLE METHODS: EFFICIENCY AND APPROXIMATION USES |
4506 | EKFNET: LEARNING SYSTEM NOISE STATISTICS FROM MEASUREMENT DATA |
5068 | ELBERT: FAST ALBERT WITH CONFIDENCE-WINDOW BASED EARLY EXIT |
5483 | Elliptical Shape Recovery from Blurred Pixels using Deep Learning |
1831 | Embedding Semantic Hierarchy in Discrete Optimal Transport for Risk Minimization |
1684 | EMFORMER: EFFICIENT MEMORY TRANSFORMER BASED ACOUSTIC MODEL FOR LOW LATENCY STREAMING SPEECH RECOGNITION |
1863 | EMOTION CONTROLLABLE SPEECH SYNTHESIS USING EMOTION-UNLABELED DATASET WITH THE ASSISTANCE OF CROSS-DOMAIN SPEECH EMOTION RECOGNITION |
2923 | EMOTION RECOGNITION BY FUSING TIME SYNCHRONOUS AND TIME ASYNCHRONOUS REPRESENTATIONS |
3479 | EMPIRICALLY ACCELERATING SCALED GRADIENT PROJECTION USING DEEP NEURAL NETWORK FOR INVERSE PROBLEMS IN IMAGE PROCESSING |
3110 | Enabling Efficient and Expressive Spatial Keyword Queries on Encrypted Data |
4524 | ENCODER-DECODER BASED PITCH TRACKING AND JOINT MODEL TRAINING FOR MANDARIN TONE CLASSIFICATION |
1550 | END TO END LEARNING FOR CONVOLUTIVE MULTI-CHANNEL WIENER FILTERING |
3182 | END2END ACOUSTIC TO SEMANTIC TRANSDUCTION |
2383 | END-2-END MODELING OF SPEECH AND GAIT FROM PATIENTS WITH PARKINSON'S DISEASE: COMPARISON BETWEEN HIGH QUALITY VS. SMARTPHONE DATA |
3729 | END-TO-END ANTI-SPOOFING WITH RAWNET2 |
4989 | END-TO-END AUDIO-VISUAL SPEECH RECOGNITION WITH CONFORMERS |
3384 | END-TO-END DEREVERBERATION, BEAMFORMING, AND SPEECH RECOGNITION WITH IMPROVED NUMERICAL STABILITY AND ADVANCED FRONTEND |
4614 | END-TO-END DIARIZATION FOR VARIABLE NUMBER OF SPEAKERS WITH LOCAL-GLOBAL NETWORKS AND DISCRIMINATIVE SPEAKER EMBEDDINGS |
3887 | End-to-end learning of variational models and solvers for the resolution of interpolation problems |
4691 | End-to-end lyrics Recognition with Voice to Singing Style Transfer |
5246 | END-TO-END MULTI-ACCENT SPEECH RECOGNITION WITH UNSUPERVISED ACCENT MODELLING |
2064 | END-TO-END MULTI-CHANNEL TRANSFORMER FOR SPEECH RECOGNITION |
3459 | END-TO-END MULTILINGUAL AUTOMATIC SPEECH RECOGNITION FOR LESS-RESOURCED LANGUAGES: THE CASE OF FOUR ETHIOPIAN LANGUAGES |
1408 | END-TO-END SPEAKER DIARIZATION AS POST-PROCESSING |
4343 | END-TO-END SPOKEN LANGUAGE UNDERSTANDING USING TRANSFORMER NETWORKS AND SELF-SUPERVISED PRE-TRAINED FEATURES |
2872 | END-TO-END TEXT-TO-SPEECH USING LATENT DURATION BASED ON VQ-VAE |
3474 | ENERGY EFFICIENCY OPTIMIZATION TECHNIQUE FOR SWIPT-ENABLED MULTI-GROUP MULTICASTING SYSTEMS WITH HETEROGENEOUS USERS |
2804 | Energy Minimization for Federated Learning with IRS-Assisted Over-the-Air Computation |
3493 | Enhanced Automotive Target Detection through Radar and Communications Sensor Fusion |
1320 | ENHANCED BLIND CALIBRATION OF UNIFORM LINEAR ARRAYS WITH ONE-BIT QUANTIZATION BY KULLBACK-LEIBLER DIVERGENCE COVARIANCE FITTING |
3974 | ENHANCED STANDARD ESPRIT FOR OVERCOMING IMPERFECTIONS IN DOA ESTIMATION |
4052 | ENHANCING AUDIO AUGMENTATION METHODS WITH CONSISTENCY LEARNING |
4747 | ENHANCING DATA-FREE ADVERSARIAL DISTILLATION WITH ACTIVATION REGULARIZATION AND VIRTUAL INTERPOLATION |
2734 | Enhancing Deep Paraphrase Identification via Leveraging Word Alignment Information |
1280 | ENHANCING IMAGE STEGANOGRAPHY VIA STEGO GENERATION AND SELECTION |
4682 | ENHANCING INTO THE CODEC: NOISE ROBUST SPEECH CODING WITH VECTOR-QUANTIZED AUTOENCODERS |
1563 | ENHANCING MODEL ROBUSTNESS BY INCORPORATING ADVERSARIAL KNOWLEDGE INTO SEMANTIC REPRESENTATION |
3649 | ENHANCING MULTI-CHANNEL EEG CLASSIFICATION WITH GRAMIAN TEMPORAL GENERATIVE ADVERSARIAL NETWORKS |
1576 | ENSEMBLE COMBINATION BETWEEN DIFFERENT TIME SEGMENTATIONS |
2740 | ENSEMBLE DISTILLATION APPROACHES FOR GRAMMATICAL ERROR CORRECTION |
1608 | ENSEMBLING OBJECT DETECTORS FOR IMAGE AND VIDEO DATA ANALYSIS |
3953 | ENSURE: Ensemble Stein's Unbiased Risk Estimator for Unsupervised Learning |
2345 | ENVIRONMENT-INDEPENDENT WI-FI HUMAN ACTIVITY RECOGNITION WITH ADVERSARIAL NETWORK |
4795 | ERROR ESTIMATES IN SECOND-ORDER CONTINUOUS-TIME SIGMA-DELTA MODULATORS |
5425 | ERROR-DRIVEN FIXED-BUDGET ASR PERSONALIZATION FOR ACCENTED SPEAKERS |
3062 | Error-driven Pruning of Language Models for Virtual Assistants |
3538 | Estimating Fiedler value on large networks based on random walk observations |
5604 | ESTIMATING NETWORK PROCESSES VIA BLIND IDENTIFICATION OF MULTIPLE GRAPH FILTERS |
4566 | ESTIMATING SEVERITY OF DEPRESSION FROM ACOUSTIC FEATURES AND EMBEDDINGS OF NATURAL SPEECH |
5072 | ESTIMATION OF GROUNDWATER STORAGE VARIATIONS IN INDUS RIVER BASIN USING GRACE DATA |
2654 | ESTIMATION OF MICROPHONE CLUSTERS IN ACOUSTIC SENSOR NETWORKS USING UNSUPERVISED FEDERATED LEARNING |
5060 | ESTIMATION OF VISUAL FEATURES OF VIEWED IMAGE FROM INDIVIDUAL AND SHARED BRAIN INFORMATION BASED ON FMRI DATA USING PROBABILISTIC GENERATIVE MODEL |
3494 | Evaluation and Comparison of Three Source Direction-of-Arrival Estimators Using Relative Harmonic Coefficients |
4715 | EVENT-DRIVEN MODULO SAMPLING |
2977 | EVOLUTIONARY QUANTIZATION OF NEURAL NETWORKS WITH MIXED-PRECISION |
2458 | EVOLVING QUANTIZED NEURAL NETWORKS FOR IMAGE CLASSIFICATION USING A MULTI-OBJECTIVE GENETIC ALGORITHM |
4186 | Exact Linear Convergence Rate Analysis for Low-Rank Symmetric Matrix Completion via Gradient Descent |
3918 | EXPEDITING DISCOVERY IN NEURAL ARCHITECTURE SEARCH BY COMBINING LEARNING WITH PLANNING |
3127 | EXPLOITING NON-NEGATIVE MATRIX FACTORIZATION FOR BINAURAL SOUND LOCALIZATION IN THE PRESENCE OF DIRECTIONAL INTERFERENCE |
3268 | EXPLOITING THE DUAL-TREE COMPLEX WAVELET TRANSFORM FOR SHIP WAKE DETECTION IN SAR IMAGERY |
2361 | EXPLORING AUTOMATIC COVID-19 DIAGNOSIS VIA VOICE AND SYMPTOMS FROM CROWDSOURCED DATA |
1115 | Exploring the application of synthetic audio in training keyword spotters |
3866 | EXPLORING THE USE OF COMMON LABEL SET TO IMPROVE SPEECH RECOGNITION OF LOW RESOURCE INDIAN LANGUAGES |
1064 | EXPLORING VISUAL-AUDIO COMPOSITION ALIGNMENT NETWORK FOR QUALITY FASHION RETRIEVAL IN VIDEO |
1634 | EXPOSING GAN-GENERATED FACES USING INCONSISTENT CORNEAL SPECULAR HIGHLIGHTS |
5586 | EXTENDED NESTED ARRAYS FOR CONSECUTIVE VIRTUAL APERTURE ENHANCEMENT |
3852 | Extended Object Tracking with Automotive Radar Using B-Spline Chained Ellipses Model |
3581 | EXTENDING MUSIC BASED ON EMOTION AND TONALITY VIA GENERATIVE ADVERSARIAL NETWORK |
4284 | EXTENDING PARROTRON: AN END-TO-END, SPEECH CONVERSION AND SPEECH RECOGNITION MODEL FOR ATYPICAL SPEECH |
2709 | Extending the Reverse JPEG Compatibility Attack to Double Compressed Images |
1900 | Factorized CRF with batch normalization based on the entire training data |
2130 | FAILURE PREDICTION BY CONFIDENCE ESTIMATION OF UNCERTAINTY-AWARE DIRICHLET NETWORKS |
5583 | Fast Adaptive Reparametrization (FAR) With Application to Human Action Recognition |
2277 | Fast and Provable Robust PCA via Normalized Coherence Pursuit |
2607 | FAST AND ROBUST ADMM FOR BLIND SUPER-RESOLUTION |
4133 | FAST AND ROBUST STRATIFIED SELF-CALIBRATION USING TIME-DIFFERENCE-OF-ARRIVAL MEASUREMENTS |
4829 | FAST DCTTS: EFFICIENT DEEP CONVOLUTIONAL TEXT-TO-SPEECH |
2098 | FAST DECENTRALIZED LINEAR FUNCTIONS VIA SUCCESSIVE GRAPH SHIFT OPERATORS |
2366 | FAST GRAPH KERNEL WITH OPTICAL RANDOM FEATURES |
2351 | Fast Hierarchy Preserving Graph Embedding via Subspace Constraints |
4101 | FAST INVERSE MAPPING OF FACE GANS |
2264 | FAST LOCAL REPRESENTATION LEARNING WITH ADAPTIVE ANCHOR GRAPH |
2951 | Fast Manifold Landmarking Using Extreme Eigen-pairs |
3370 | FAST THRESHOLD OPTIMIZATION FOR MULTI-LABEL AUDIO TAGGING USING SURROGATE GRADIENT LEARNING |
1995 | FAST: FEATURE AGGREGATION FOR DETECTING SALIENT OBJECT IN REAL-TIME |
2175 | FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization |
5296 | FASTPITCH: PARALLEL TEXT-TO-SPEECH WITH PITCH PREDICTION |
2236 | FC2RN: A FULLY CONVOLUTIONAL CORNER REFINEMENT NETWORK FOR ACCURATE MULTI-ORIENTED SCENE TEXT DETECTION |
3410 | FCL-TACO2: TOWARDS FAST, CONTROLLABLE AND LIGHTWEIGHT TEXT-TO-SPEECH SYNTHESIS |
5391 | FEATURE INTEGRATION VIA SEMI-SUPERVISED ORDINALLY MULTI-MODAL GAUSSIAN PROCESS LATENT VARIABLE MODEL |
5233 | FEATURE REDUNDANCY MINING: DEEP LIGHT-WEIGHT IMAGE SUPER-RESOLUTION MODEL |
4067 | FEATURE REUSE FOR A RANDOMIZATION BASED NEURAL NETWORK |
2852 | FEDERATED ACOUSTIC MODELING FOR AUTOMATIC SPEECH RECOGNITION |
4632 | FEDERATED ALGORITHM WITH BAYESIAN APPROACH: OMNI-FEDGE |
2298 | FEDERATED DROPOUT LEARNING FOR HYBRID BEAMFORMING WITH SPATIAL PATH INDEX MODULATION IN MULTI-USER MMWAVE-MIMO SYSTEMS |
1532 | FEDERATED LEARNING FROM BIG DATA OVER NETWORKS |
2751 | FEDERATED LEARNING WITH LOCAL DIFFERENTIAL PRIVACY: TRADE-OFFS BETWEEN PRIVACY, UTILITY, AND COMMUNICATION |
2101 | FEDERATED MARGINAL PERSONALIZATION FOR ASR RESCORING |
5623 | Feedforward Selective Fixed-Filter Active Noise Control: Algorithm and Implementation |
2966 | FEW-SHOT CONTINUAL LEARNING FOR AUDIO CLASSIFICATION |
3874 | FEW-SHOT IMAGE CLASSIFICATION WITH MULTI-FACET PROTOTYPES |
2823 | Few-shot Learning for CT Scan based COVID-19 Diagnosis |
1168 | Few-Shot Learning for Decoding Surface Electromyography for Hand Gesture Recognition |
3125 | Fiber-Sampled Stochastic Mirror Descent For Tensor Decomposition with beta-Divergence |
3853 | FiGLearn: Filter and Graph Learning using Optimal Transport |
3154 | FINE-GRAINED MRI RECONSTRUCTION USING ATTENTIVE SELECTION GENERATIVE ADVERSARIAL NETWORKS |
5242 | Fine-Grained Pose Temporal Memory Module for Video Pose Estimation and Tracking |
3765 | FINE-TUNING OF PRE-TRAINED END-TO-END SPEECH RECOGNITION WITH GENERATIVE ADVERSARIAL NETWORKS |
4226 | FIRST-ORDER FAST ALGORITHM FOR STRUCTURALLY OPTIMAL MULTI-GROUP MULTICAST BEAMFORMING IN LARGE-SCALE SYSTEMS |
3235 | FLOW-BASED SELF-SUPERVISED DENSITY ESTIMATION FOR ANOMALOUS SOUND DETECTION |
1916 | FMA-ETA: ESTIMATING TRAVEL TIME ENTIRELY BASED ON FFN WITH ATTENTION |
1047 | F-NET: FUSION NEURAL NETWORK FOR VEHICLE TRAJECTORY PREDICTION IN AUTONOMOUS DRIVING |
2807 | Focus on the present: a regularization method for the ASR source-target attention layer |
5602 | FOCUSING AND FREQUENCY SMOOTHING FOR ARBITRARY ARRAYS WITH APPLICATION TO SPEAKER LOCALIZATION |
2482 | FOCUSING-BASED WIDEBAND ADAPTIVE BEAMFORMING USING COVARIANCE MATRIX RECONSTRUCTION |
3077 | FONTNET: ON-DEVICE FONT UNDERSTANDING AND PREDICTION PIPELINE |
3850 | FOOLHD: FOOLING SPEAKER IDENTIFICATION BY HIGHLY IMPERCEPTIBLE ADVERSARIAL DISTURBANCES |
3691 | FORENSICABILITY OF DEEP NEURAL NETWORK INFERENCE PIPELINES |
4452 | Four-Dimensional High-Resolution Automotive Radar Imaging Exploiting Joint Sparse-Frequency and Sparse-Array Design |
4084 | FOURIER TRANSFORMATION AUTOENCODERS FOR ANOMALY DETECTION |
2193 | FOVEAL AVASCULAR ZONE SEGMENTATION OF OCTA IMAGES USING DEEP LEARNING APPROACH WITH UNSUPERVISED VESSEL SEGMENTATION |
2548 | FPGA HARDWARE DESIGN FOR PLENOPTIC 3D IMAGE PROCESSING ALGORITHM TARGETING A MOBILE APPLICATION |
1371 | FRAGMENTVC: ANY-TO-ANY VOICE CONVERSION BY END-TO-END EXTRACTING AND FUSING FINE-GRAINED VOICE FRAGMENTS WITH ATTENTION |
3694 | FRAME RATE UP-CONVERSION USING KEY POINT AGNOSTIC FREQUENCY-SELECTIVE MESH-TO-GRID RESAMPLING |
1318 | Frame-rate-aware Aggregation For Efficient Video Super-resolution |
5596 | FREQUENCY ESTIMATION IN COHERENT, PERIODIC PULSE TRAINS |
4814 | Frequency-Temporal Attention Network for Singing Melody Extraction |
5258 | FULL-DUPLEX MULTIFUNCTION TRANSCEIVER WITH JOINT CONSTANT ENVELOPE TRANSMISSION AND WIDEBAND RECEPTION |
4270 | FULLSUBNET: A FULL-BAND AND SUB-BAND FUSION MODEL FOR REAL-TIME SINGLE-CHANNEL SPEECH ENHANCEMENT |
3093 | FULLY-NEURAL APPROACH TO VEHICLE WEIGHING AND STRAIN PREDICTION ON BRIDGES USING WIRELESS ACCELEROMETERS |
5160 | FUNDAMENTAL FREQUENCY FEATURE NORMALIZATION AND DATA AUGMENTATION FOR CHILD SPEECH RECOGNITION |
4696 | FUNDAMENTAL TRADE-OFFS IN NOISY SUPER-RESOLUTION WITH SYNTHETIC APERTURES |
3751 | FUSING INFORMATION STREAMS IN END-TO-END AUDIO-VISUAL SPEECH RECOGNITION |
2036 | FUSING MULTITASK MODELS BY RECURSIVE LEAST SQUARES |
1656 | FUSION-BASED DIGITAL IMAGE CORRELATION FRAMEWORK FOR STRAIN MEASUREMENT |
3769 | FWB-NET: FRONT WHITE BALANCE NETWORK FOR COLOR SHIFT CORRECTION IN SINGLE IMAGE DEHAZING VIA ATMOSPHERIC LIGHT ESTIMATION |
5269 | GAN-BASED OUT-OF-DOMAIN DETECTION USING BOTH IN-DOMAIN AND OUT-OF-DOMAIN SAMPLES |
3094 | G-ARRAYS: GEOMETRIC ARRAYS FOR EFFICIENT POINT CLOUD PROCESSING |
1252 | GATE TRIMMING: ONE-SHOT CHANNEL PRUNING FOR EFFICIENT CONVOLUTIONAL NEURAL NETWORKS |
5085 | Gating Feature Dense Network for Single Anisotropic MR Image Super-resolution |
4093 | Gaussian Kernelized Self-Attention for Long Sequence Data and Its Application to CTC-based Speech Recognition |
2711 | Gaussian Process Temporal-Difference Learning with Scalability and Worst-Case Performance Guarantees |
2858 | GDTW: A NOVEL DIFFERENTIABLE DTW LOSS FOR TIME SERIES TASKS |
3768 | GENERAL TOTAL VARIATION REGULARIZED SPARSE BAYESIAN LEARNING FOR ROBUST BLOCK-SPARSE SIGNAL RECOVERY |
3028 | GENERALIZED KNOWLEDGE DISTILLATION FROM AN ENSEMBLE OF SPECIALIZED TEACHERS LEVERAGING UNSUPERVISED NEURAL CLUSTERING |
5363 | GENERALIZED POLYTOPIC MATRIX FACTORIZATION |
1710 | Generalized Thinned Coprime Array for DOA Estimation |
2311 | GENERATING EMPATHETIC RESPONSES BY INJECTING ANTICIPATED EMOTION |
5071 | GENERATING HUMAN READABLE TRANSCRIPT FOR AUTOMATIC SPEECH RECOGNITION WITH PRE-TRAINED LANGUAGE MODEL |
3872 | GENERATING NATURAL QUESTIONS FROM IMAGES FOR MULTIMODAL ASSISTANTS |
5301 | GENERATIVE INFORMATION FUSION |
3105 | GENERATIVE SPEECH CODING WITH PREDICTIVE VARIANCE REGULARIZATION |
4021 | GEOMETRIC SCATTERING ATTENTION NETWORKS |
1338 | GEOMETRY CONSISTENCY OF AUGMENTED REALITY BASED ON SEMANTICS |
3455 | GEOM-SPIDER-EM: FASTER VARIANCE REDUCED STOCHASTIC EXPECTATION MAXIMIZATION FOR NONCONVEX FINITE-SUM OPTIMIZATION |
2299 | GLOBAL-LOCALIZED AGENT GRAPH CONVOLUTION FOR MULTI-AGENT REINFORCEMENT LEARNING |
2302 | GLOBALLY OPTIMAL BEAMFORMING FOR RATE SPLITTING MULTIPLE ACCESS |
4062 | GPS-DENIED NAVIGATION USING SAR IMAGES AND NEURAL NETWORKS |
2692 | GRADRAKER-BASED PREDICTION ALGORITHMS ON MULTI-LAYER GRAPHS |
4823 | GRADUAL FEDERATED LEARNING USING SIMULATED ANNEALING |
1447 | GRAMIAN-BASED ADAPTIVE COMBINATION POLICIES FOR DIFFUSION LEARNING OVER NETWORKS |
4253 | GRANGER CAUSALITY BASED DIRECTIONAL PHASE-AMPLITUDE COUPLING MEASURE |
1736 | GRAPH ATTENTION AND INTERACTION NETWORK WITH MULTI-TASK LEARNING FOR FACT VERIFICATION |
4354 | GRAPH ATTENTION NETWORKS FOR SPEAKER VERIFICATION |
4900 | GRAPH EMBEDDING USING MULTI-LAYER ADJACENT POINT MERGING MODEL |
3759 | GRAPH ENHANCED QUERY REWRITING FOR SPOKEN LANGUAGE UNDERSTANDING SYSTEM |
4333 | GRAPH FREQUENCY ANALYSIS OF COVID-19 INCIDENCE TO IDENTIFY COUNTY-LEVEL CONTAGION PATTERNS IN THE UNITED STATES |
5556 | Graph learning under spectral sparsity constraints |
3646 | GRAPH NEURAL NETWORK FOR LARGE-SCALE NETWORK LOCALIZATION |
3446 | GRAPH NEURAL NETWORKS FOR DECENTRALIZED CONTROLLERS |
1222 | Graph Signal Compression via Task-Based Quantization |
5095 | GRAPH SIGNAL DENOISING USING NESTED-STRUCTURED DEEP ALGORITHM UNROLLING |
2689 | Graph signal denoising via unrolling networks |
1676 | Graph-Adaptive Incremental learning using an ensemble of Gaussian process experts |
3555 | Graph-Based Pyramid Global Context Reasoning with A Saliency-Aware Projection for COVID-19 Lung Infections Segmentation |
3419 | GRAPHCOMM: A GRAPH NEURAL NETWORK BASED METHOD FOR MULTI-AGENT REINFORCEMENT LEARNING |
3429 | Graph-Homomorphic Perturbations for Private Decentralized Learning |
3372 | GraphNet: Graph Clustering with Deep Neural Networks |
1842 | GRAPHON AND GRAPH NEURAL NETWORK STABILITY |
5299 | GRAPHSPEECH: SYNTAX-AWARE GRAPH ATTENTION NETWORK FOR NEURAL SPEECH SYNTHESIS |
4348 | Grid Optimization for Matrix-based Source Localization under Inhomogeneous Sensor Topology |
5579 | GROOVE2GROOVE: ONE-SHOT MUSIC STYLE TRANSFER WITH SUPERVISION FROM SYNTHETIC DATA |
4338 | GTA-NET: GRADUAL TEMPORAL AGGREGATION NETWORK FOR FAST VIDEO DERAINING |
5125 | Guaranteed reconstruction from integrate-and-fire neurons with alpha synaptic activation |
4901 | Guided Variational Autoencoder for Speech Enhancement With a Supervised Classifier |
3644 | HANDLING CLASS IMBALANCE IN LOW-RESOURCE DIALOGUE SYSTEMS BY COMBINING FEW-SHOT CLASSIFICATION AND INTERPOLATION |
4135 | HANDWRITTEN DIGITS RECONSTRUCTION FROM UNLABELLED EMBEDDINGS |
5235 | HARDWARE IMPLEMENTATION OF ITERATIVE PROJECTION-AGGREGATION DECODING OF REED-MULLER CODES |
1871 | HAVE YOU MADE A DECISION? WHERE? A PILOT STUDY ON INTERPRETABILITY OF POLARITY ANALYSIS BASED ON ADVISING PROBLEM |
4661 | HCAG: A HIERARCHICAL CONTEXT-AWARE GRAPH ATTENTION MODEL FOR DEPRESSION DETECTION |
3892 | HCGM-NET: A DEEP UNFOLDING NETWORK FOR FINANCIAL INDEX TRACKING |
3587 | HEAD-SYNCHRONOUS DECODING FOR TRANSFORMER-BASED STREAMING ASR |
2462 | HEBBNET: A SIMPLIFIED HEBBIAN LEARNING FRAMEWORK TO DO BIOLOGICALLY PLAUSIBLE LEARNING |
4543 | HETEROGENEOUS TWO-STREAM NETWORK WITH HIERARCHICAL FEATURE PREFUSION FOR MULTISPECTRAL PAN-SHARPENING |
1229 | HFGCNet: High-frequency Graph Reasoning for Finer Semantic Image Segmentation |
1404 | H-GPR: A HYBRID STRATEGY FOR LARGE-SCALE GAUSSIAN PROCESS REGRESSION |
1582 | HIDDEN MARKOV MODEL DIARISATION WITH SPEAKER LOCATION INFORMATION |
4612 | Hide Chopin in the Music: Efficient Information Steganography via Random Shuffling |
1090 | Hierarchical Attention Fusion for Geo-Localization |
1587 | HIERARCHICAL ATTENTION-BASED TEMPORAL CONVOLUTIONAL NETWORKS FOR EEG-BASED EMOTION RECOGNITION |
4892 | HIERARCHICAL BIT-WISE DIFFERENTIAL CODING (HBDC) OF POINT CLOUD ATTRIBUTES |
4148 | HIERARCHICAL CODED ELASTIC COMPUTING |
1955 | HIERARCHICAL CONTEXT GUIDED AGGREGATION NETWORK FOR STEREO MATCHING |
2862 | HIERARCHICAL NETWORK BASED ON THE FUSION OF STATIC AND DYNAMIC FEATURES FOR SPEECH EMOTION RECOGNITION |
4173 | HIERARCHICAL POSE CLASSIFICATION FOR INFANT ACTION ANALYSIS AND MENTAL DEVELOPMENT ASSESSMENT |
3059 | HIERARCHICAL RECURRENT NEURAL NETWORK FOR HANDWRITTEN STROKES CLASSIFICATION |
5150 | Hierarchical Refined Attention For Scene Text Recognition |
3686 | HIERARCHICAL SIMILARITY LEARNING FOR LANGUAGE-BASED PRODUCT IMAGE RETRIEVAL |
3476 | HIERARCHICAL SPEAKER-AWARE SEQUENCE-TO-SEQUENCE MODEL FOR DIALOGUE SUMMARIZATION |
4933 | HIERARCHICAL TRANSFORMER-BASED LARGE-CONTEXT END-TO-END ASR WITH LARGE-CONTEXT KNOWLEDGE DISTILLATION |
1533 | HIGCNN: HIERARCHICAL INTERLEAVED GROUP CONVOLUTIONAL NEURAL NETWORKS FOR POINT CLOUDS ANALYSIS |
2652 | HIGH ACCURACY TRACKING OF TARGETS USING MASSIVE MIMO |
3126 | High Fidelity Speech Regeneration with Application to Speech Enhancement |
2870 | HIGH-FREQUENCY ADVERSARIAL DEFENSE FOR SPEECH AND AUDIO |
4700 | HIGH-INTELLIGIBILITY SPEECH SYNTHESIS FOR DYSARTHRIC SPEAKERS WITH LPCNET-BASED TTS AND CYCLEVAE-BASED VC |
1610 | Highly Efficient Protection of Biometric Face Samples with Selective JPEG2000 Encryption |
4298 | HIGH-THROUGHPUT VLSI ARCHITECTURE FOR SOFT-DECISION DECODING WITH ORBGRAND |
2936 | HISTORY UTTERANCE EMBEDDING TRANSFORMER LM FOR SPEECH RECOGNITION |
4382 | HOCA: HIGHER-ORDER CHANNEL ATTENTION FOR SINGLE IMAGE SUPER-RESOLUTION |
1475 | HOW CONVOLUTIONAL NEURAL NETWORKS DEAL WITH ALIASING |
3840 | How Phonotactics Affect Multilingual and Zero-shot ASR Performance |
2831 | How Similar or Different Is Rakugo Speech Synthesizer to Professional Performers? |
3291 | HOW TO MAKE TEXT-TO-SPEECH SYSTEM PRONOUNCE “VOLDEMORT”: AN EXPERIMENTAL APPROACH OF FOREIGN WORD PHONEMIZATION IN VIETNAMESE |
5262 | How to Use Time Information Effectively? Combining with Time Shift Module for Lipreading |
1439 | HSAN: A HIERARCHICAL SELF-ATTENTION NETWORK FOR MULTI-TURN DIALOGUE GENERATION |
1387 | HUBERT: HOW MUCH CAN A BAD TEACHER BENEFIT ASR PRE-TRAINING? |
1681 | HUMANACGAN: CONDITIONAL GENERATIVE ADVERSARIAL NETWORK WITH HUMAN-BASED AUXILIARY CLASSIFIER AND ITS EVALUATION IN PHONEME PERCEPTION |
2443 | HUMAN-AWARE COARSE-TO-FINE ONLINE ACTION DETECTION |
5460 | Human-centered Favorite Music Classification Using EEG-based Individual Music Preference via Deep Time-series CCA |
2696 | Human-Expert-Level Brain Tumor Detection Using Deep Learning with Data Distillation and Augmentation |
1718 | HVS-BASED PERCEPTUAL COLOR COMPRESSION OF IMAGE DATA |
1239 | HYBRID ANALOG-DIGITAL MIMO RADAR RECEIVERS WITH BIT-LIMITED ADCS |
1567 | HYBRID BEAMFORMING FOR WIDEBAND OFDM DUAL FUNCTION RADAR COMMUNICATIONS |
4439 | HYPERSPECTRAL IMAGE SUPER-RESOLUTION VIA ADJACENT SPECTRAL FUSION STRATEGY |
1327 | HYPOTHESIS STITCHER FOR END-TO-END SPEAKER-ATTRIBUTED ASR ON LONG-FORM MULTI-TALKER RECORDINGS |
3740 | ICA WITH ORTHOGONALITY CONSTRAINT: IDENTIFIABILITY AND A NEW EFFICIENT ALGORITHM |
4929 | ICASSP 2021 ACOUSTIC ECHO CANCELLATION CHALLENGE: DATASETS, TESTING FRAMEWORK, AND RESULTS |
4536 | ICASSP 2021 ACOUSTIC ECHO CANCELLATION CHALLENGE: INTEGRATED ADAPTIVE ECHO CANCELLATION WITH TIME ALIGNMENT AND DEEP LEARNING-BASED RESIDUAL ECHO PLUS NOISE SUPPRESSION |
4147 | ICASSP 2021 DEEP NOISE SUPPRESSION CHALLENGE |
2859 | ICASSP 2021 DEEP NOISE SUPPRESSION CHALLENGE: DECOUPLING MAGNITUDE AND PHASE OPTIMIZATION WITH A TWO-STAGE DEEP NETWORK |
2673 | ICI-AWARE PARAMETER ESTIMATION FOR MIMO-OFDM RADAR VIA APES SPATIAL FILTERING |
2048 | IDENTIFICATION OF DEEP BREATH WHILE MOVING FORWARD BASED ON MULTIPLE BODY REGIONS AND GRAPH SIGNAL ANALYSIS |
4334 | IDENTIFICATION OF UTERINE CONTRACTIONS BY AN ENSEMBLE OF GAUSSIAN PROCESSES |
5069 | IDENTIFYING FIRST-ORDER LOWPASS GRAPH SIGNALS USING PERRON FROBENIUS THEOREM |
1956 | IDENTIFYING SPAMMERS TO BOOST CROWDSOURCED CLASSIFICATION |
3195 | IMAGE CODING FOR MACHINES: AN END-TO-END LEARNED APPROACH |
4233 | IMAGE CODING WITH NEURAL NETWORK-BASED COLORIZATION |
3137 | Image Denoising Based on Correlation Adaptive Sparse Modeling |
2896 | IMAGE GENERATION BASED ON TEXTURE GUIDED VAE-AGAN FOR REGIONS OF INTEREST DETECTION IN REMOTE SENSING IMAGES |
2181 | IMAGE STEGANOGRAPHY BASED ON ITERATIVE ADVERSARIAL PERTURBATIONS ONTO A SYNCHRONIZED-DIRECTIONS SUB-IMAGE |
4812 | IMAGE SUPER-RESOLUTION USING MULTI-RESOLUTION ATTENTION NETWORK |
4553 | IMAGE-ASSISTED TRANSFORMER IN ZERO-RESOURCE MULTI-MODAL TRANSLATION |
2530 | Impact of Sound Duration and Inactive Frames on Sound Event Detection Performance |
5177 | Impact of speaking rate on the source filter Interaction in speech: a study |
2653 | Implicit HRTF Modeling Using Temporal Convolutional Networks |
3833 | Improved Atomic Norm Based Channel Estimation for Time-varying Narrowband Leaked Channels |
5590 | IMPROVED COVARIANCE MATRIX ESTIMATION WITH AN APPLICATION IN PORTFOLIO OPTIMIZATION |
4438 | IMPROVED DATA SELECTION FOR DOMAIN ADAPTATION IN ASR |
2910 | IMPROVED INTRA MODE CODING BEYOND AV1 |
3669 | IMPROVED MASK-CTC FOR NON-AUTOREGRESSIVE END-TO-END ASR |
3655 | IMPROVED NEURAL LANGUAGE MODEL FUSION FOR STREAMING RECURRENT NEURAL NETWORK TRANSDUCER |
4906 | Improved Probabilistic Context-Free Grammars for Passwords Using Word Extraction |
2469 | IMPROVED ROBUSTNESS TO DISFLUENCIES IN RNN-TRANSDUCER BASED SPEECH RECOGNITION |
4980 | IMPROVED STEP-SIZE SCHEDULES FOR NOISY GRADIENT METHODS |
2840 | IMPROVED SUPERVISED TRAINING OF PHYSICS-GUIDED DEEP LEARNING IMAGE RECONSTRUCTION WITH MULTI-MASKING |
4529 | IMPROVEMENTS TO PROSODIC ALIGNMENT FOR AUTOMATIC DUBBING |
3786 | IMPROVING AUDIO ANOMALIES RECOGNITION USING TEMPORAL CONVOLUTIONAL ATTENTION NETWORK |
4293 | IMPROVING AUTOMATIC DRUM TRANSCRIPTION USING LARGE-SCALE AUDIO-TO-MIDI ALIGNED DATA |
3171 | Improving Cross-domain Slot Filling with Common Syntactic Structure |
4428 | IMPROVING DEEP LEARNING SOUND EVENTS CLASSIFIERS USING GRAM MATRIX FEATURE-WISE CORRELATIONS |
4775 | IMPROVING DIALOGUE RESPONSE GENERATION VIA KNOWLEDGE GRAPH FILTER |
4363 | IMPROVING ENTITY RECALL IN AUTOMATIC SPEECH RECOGNITION WITH NEURAL EMBEDDINGS |
3414 | IMPROVING EVENT DETECTION BY EXPLOITING LABEL HIERARCHY |
2127 | IMPROVING IDENTIFICATION OF SYSTEM-DIRECTED SPEECH UTTERANCES BY DEEP LEARNING OF ASR-BASED WORD EMBEDDINGS AND CONFIDENCE METRICS |
4425 | IMPROVING INTRAOPERATIVE LIVER REGISTRATION IN IMAGE-GUIDED SURGERY WITH LEARNING-BASED RECONSTRUCTION |
3101 | Improving memory banks for unsupervised learning with large mini-batch, consistency and hard negative mining |
3731 | IMPROVING MULTIMODAL SPEECH ENHANCEMENT BY INCORPORATING SELF-SUPERVISED AND CURRICULUM LEARNING |
3058 | IMPROVING NATURALNESS AND CONTROLLABILITY OF SEQUENCE-TO-SEQUENCE SPEECH SYNTHESIS BY LEARNING LOCAL PROSODY REPRESENTATIONS |
1686 | IMPROVING NER IN SOCIAL MEDIA VIA ENTITY TYPE-COMPATIBLE UNKNOWN WORD SUBSTITUTION |
3021 | IMPROVING NEURAL TEXT NORMALIZATION WITH PARTIAL PARAMETER GENERATOR AND POINTER-GENERATOR NETWORK |
3240 | IMPROVING PRONUNCIATION ASSESSMENT VIA ORDINAL REGRESSION WITH ANCHORED REFERENCE SAMPLES |
2513 | IMPROVING PROSODY MODELLING WITH CROSS-UTTERANCE BERT EMBEDDINGS FOR END-TO-END SPEECH SYNTHESIS |
4807 | IMPROVING RECONSTRUCTION LOSS BASED SPEAKER EMBEDDING IN UNSUPERVISED AND SEMI-SUPERVISED SCENARIOS |
2612 | IMPROVING RNN TRANSDUCER MODELING FOR SMALL-FOOTPRINT KEYWORD SPOTTING |
2776 | IMPROVING RNN TRANSDUCER WITH TARGET SPEAKER EXTRACTION AND NEURAL UNCERTAINTY ESTIMATION |
3576 | IMPROVING SOUND EVENT DETECTION METRICS: INSIGHTS FROM DCASE 2020 |
4620 | IMPROVING SPEAKER VERIFICATION IN REVERBERANT ENVIRONMENTS |
1541 | Improving Stability of Adversarial Li-ion Cell Usage Data Generation using Generative Latent Space Modelling |
2785 | IMPROVING STREAMING AUTOMATIC SPEECH RECOGNITION WITH NON-STREAMING MODEL DISTILLATION ON UNSUPERVISED DATA |
3926 | IMPROVING THE CLASSIFICATION OF RARE CHORDS WITH UNLABELED DATA |
3714 | IMPROVING THE ENERGY-EFFICIENCY OF A KALMAN FILTER USING UNRELIABLE MEMORIES |
5622 | Improving the Harmony of the Composite Image by Spatial-Separated Attention Module |
5193 | IMPROVING THE ROBUSTNESS OF RIGHT WHALE DETECTION IN NOISY CONDITIONS USING DENOISING AUTOENCODERS AND AUGMENTED TRAINING |
4403 | IMPROVING ULTRASOUND TONGUE CONTOUR EXTRACTION USING U-NET AND SHAPE CONSISTENCY-BASED REGULARIZER |
3161 | IMRNET: AN ITERATIVE MOTION COMPENSATION AND RESIDUAL RECONSTRUCTION NETWORK FOR VIDEO COMPRESSED SENSING |
1245 | IN SITU CALIBRATION OF CROSS-SENSITIVE SENSORS IN MOBILE SENSOR ARRAYS USING FAST INFORMED NON-NEGATIVE MATRIX FACTORIZATION |
1125 | In-bed Pressure-based Pose Estimation using Image Space Representation Learning |
5191 | INCOMPLETE MULTI-VIEW SUBSPACE CLUSTERING WITH LOW-RANK TENSOR |
1709 | Incorporate Maximum Mean Discrepancy in Recurrent Latent Space for Sequential Generative Model |
1813 | INCORPORATING SYNTACTIC AND PHONETIC INFORMATION INTO MULTIMODAL WORD EMBEDDINGS USING GRAPH CONVOLUTIONAL NETWORKS |
1726 | INCORPORATING UNCERTAINTY IN DATA LABELING INTO DETECTION OF BRAIN INTERICTAL EPILEPTIFORM DISCHARGES FROM EEG USING WEIGHTED OPTIMIZATION |
3776 | INDEPENDENT SIGN LANGUAGE RECOGNITION WITH 3D BODY, HANDS, AND FACE RECONSTRUCTION |
2018 | INDEPENDENT VECTOR ANALYSIS USING SEMI-PARAMETRIC DENSITY ESTIMATION VIA MULTIVARIATE ENTROPY MAXIMIZATION |
2032 | Inertial Proximal Deep Learning Alternating Minimization for Efficient Neural Network Training |
4925 | INFERRING HIGH-RESOLUTIONAL URBAN FLOW WITH INTERNET OF MOBILE THINGS |
2735 | Information and Regularization in Restricted Boltzmann Machines |
2925 | INFORMATION DECODING AND SDR IMPLEMENTATION OF DFRC SYSTEMS WITHOUT TRAINING SIGNALS |
2049 | INJECTING WORD INFORMATION WITH MULTI-LEVEL WORD ADAPTER FOR CHINESE SPOKEN LANGUAGE UNDERSTANDING |
1934 | Instance segmentation with the number of clusters incorporated in embedding learning |
4141 | INSTRUMENT CLASSIFICATION OF SOLO SHEET MUSIC IMAGES |
4026 | Integer Carrier Frequency Offset Estimation In OFDM with Zadoff-Chu Sequences |
1635 | INTEGRATED CLASSIFICATION AND LOCALIZATION OF TARGETS USING BAYESIAN FRAMEWORK IN AUTOMOTIVE RADARS |
4169 | INTEGRATED GRAD-CAM: SENSITIVITY-AWARE VISUAL EXPLANATION OF DEEP CONVOLUTIONAL NETWORKS VIA INTEGRATED GRADIENT-BASED SCORING |
3817 | INTEGRATING DEEP LEARNING WITH FIRST-ORDER LOGIC PROGRAMMED CONSTRAINTS FOR ZERO-DAY PHISHING ATTACK DETECTION |
3671 | Integrating end-to-end neural and clustering-based diarization: Getting the best of both worlds |
3558 | INTEGRATING SUBGRAPH-AWARE RELATION AND DIRECTION REASONING FOR QUESTION ANSWERING |
2071 | INTERFERENCE ANALYSIS IN RECONFIGURABLE INTELLIGENT SURFACE-ASSISTED MULTIPLE-INPUT MULTIPLE-OUTPUT SYSTEMS |
5115 | INTERMEDIATE LOSS REGULARIZATION FOR CTC-BASED SPEECH RECOGNITION |
1878 | INTERNAL LANGUAGE MODEL TRAINING FOR DOMAIN-ADAPTIVE END-TO-END SPEECH RECOGNITION |
4001 | INTERPOLATION OF IRREGULARLY SAMPLED FREQUENCY RESPONSE FUNCTIONS USING CONVOLUTIONAL NEURAL NETWORKS |
4279 | INTERPRETING GLOTTAL FLOW DYNAMICS FOR DETECTING COVID-19 FROM VOICE |
1389 | Introducing Deep Reinforcement Learning to NLU Ranking Tasks |
3308 | Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning |
5311 | INVESTIGATING THE EFFICACY OF MUSIC VERSION RETRIEVAL SYSTEMS FOR SETLIST IDENTIFICATION |
2921 | INVESTIGATION OF FAST AND EFFICIENT METHODS FOR MULTI-SPEAKER MODELING AND SPEAKER ADAPTATION |
1030 | ITERATIVE GEOMETRY CALIBRATION FROM DISTANCE ESTIMATES FOR WIRELESS ACOUSTIC SENSOR NETWORKS |
1420 | ITERATIVE REWEIGHTED ALGORITHMS FOR JOINT USER IDENTIFICATION AND CHANNEL ESTIMATION IN SPATIALLY CORRELATED MASSIVE MTC |
4835 | JAMMING STRATEGY GENERATION FOR HIDDEN COMMUNICATION MODES VIA GRAPH CONVOLUTION NETWORKS |
4706 | Joint Alignment Learning-Attention based Model for Grapheme-to-Phoneme Conversion |
5615 | JOINT AMPLITUDE AND PHASE REFINEMENT FOR MONAURAL SOURCE SEPARATION |
3170 | JOINT ASR AND LANGUAGE IDENTIFICATION USING RNN-T: AN EFFICIENT APPROACH TO DYNAMIC LANGUAGE SWITCHING |
4171 | JOINT CHANNEL, DATA, AND PHASE-NOISE ESTIMATION IN MIMO-OFDM SYSTEMS USING A TENSOR MODELING APPROACH |
3217 | JOINT COMMUNICATIONS WITH FH-MIMO RADAR SYSTEMS: AN EXTENDED SIGNALING STRATEGY |
3265 | JOINT COUPLED TRANSFORM LEARNING FRAMEWORK FOR MULTIMODAL IMAGE SUPER-RESOLUTION |
1933 | JOINT DEREVERBERATION AND SEPARATION WITH ITERATIVE SOURCE STEERING |
5609 | Joint DOD and DOA Estimation in Slow-Time MIMO Radar via PARAFAC Decomposition |
1972 | JOINT INTENT DETECTION AND SLOT FILLING BASED ON CONTINUAL LEARNING MODEL |
1896 | JOINT LEARNING OF IMAGE AESTHETIC QUALITY ASSESSMENT AND SEMANTIC RECOGNITION BASED ON FEATURE ENHANCEMENT |
2332 | JOINT LOCALIZATION AND PREDICTIVE BEAMFORMING IN VEHICULAR NETWORKS: POWER ALLOCATION BEYOND WATER-FILLING |
4384 | JOINT MASKED CPC AND CTC TRAINING FOR ASR |
1034 | JOINT MAXIMUM LIKELIHOOD ESTIMATION OF POWER SPECTRAL DENSITIES AND RELATIVE ACOUSTIC TRANSFER FUNCTIONS FOR ACOUSTIC BEAMFORMING |
3345 | JOINT MULTI-PITCH DETECTION AND SCORE TRANSCRIPTION FOR POLYPHONIC PIANO MUSIC |
2009 | JOINT OPTIMIZATION FOR FULL-DUPLEX CELLULAR COMMUNICATIONS VIA INTELLIGENT REFLECTING SURFACE |
3709 | JOINT OPTIMIZATION OF SPECTRALLY CO-EXISTING MULTI-CARRIER RADAR AND COMMUNICATION SYSTEMS IN CLUTTERED ENVIRONMENTS |
2541 | JOINT REINFORCEMENT LEARNING AND GAME THEORY BITRATE CONTROL METHOD FOR 360-DEGREE DYNAMIC ADAPTIVE STREAMING |
2525 | JOINTLY TRAINED TRANSFORMERS MODELS FOR SPOKEN LANGUAGE TRANSLATION |
3415 | KALMAN FILTER BASED MIMO CSI PHASE RECOVERY FOR COTS WIFI DEVICES |
3375 | Kalman Optimizer for Consistent Gradient Descent |
2257 | KALMANNET: DATA-DRIVEN KALMAN FILTERING |
3486 | KAN: KNOWLEDGE-AUGMENTED NETWORKS FOR FEW-SHOT LEARNING |
4508 | KARAOKE KEY RECOMMENDATION VIA PERSONALIZED COMPETENCE-BASED RATING PREDICTION |
5374 | KERNEL LEARNING WITH TENSOR NETWORKS |
2038 | KERNEL ORTHOGONAL NONNEGATIVE MATRIX FACTORIZATION: APPLICATION TO MULTISPECTRAL DOCUMENT IMAGE DECOMPOSITION |
4106 | KERNEL REGRESSION ON GRAPHS IN RANDOM FOURIER FEATURES SPACE |
3627 | KERNEL-BASED LIFELONG POLICY GRADIENT REINFORCEMENT LEARNING |
2600 | KERNEL-INTERPOLATION-BASED FILTERED-X LEAST MEAN SQUARE FOR SPATIAL ACTIVE NOISE CONTROL IN TIME DOMAIN |
2713 | KLD MINIMIZATION-BASED CONSTRAINED MEASUREMENT FILTERING FOR TWO-STEP TDOA INDOOR TRACKING |
2132 | KNOWLEDGE DISTILLATION FOR IMPROVED ACCURACY IN SPOKEN QUESTION ANSWERING |
2399 | KNOWLEDGE REASONING FOR SEMANTIC SEGMENTATION |
2675 | KNOWLEDGE TRANSFER FOR EFFICIENT ON-DEVICE FALSE TRIGGER MITIGATION |
2508 | KNOWLEDGE-BASED CHAT DETECTION WITH FALSE MENTION DISCRIMINATION |
1357 | Label-aware Text Representation for Multi-label Text Classification |
1330 | LABEL-GUIDED DICTIONARY PAIR LEARNING FOR ECG BIOMETRIC RECOGNITION |
2809 | LANGUAGE MODEL IS ALL YOU NEED: NATURAL LANGUAGE UNDERSTANDING AS QUESTION ANSWERING |
3378 | LANGUAGE-SENSITIVE MUSIC EMOTION RECOGNITION MODELS: ARE WE REALLY THERE YET? |
4197 | LAPLACIAN REGULARIZED TENSOR LOW-RANK MINIMIZATION FOR HYPERSPECTRAL SNAPSHOT COMPRESSIVE IMAGING |
5612 | Large Database Compression Based on Perceived Information |
2216 | LARGE MARGIN TRAINING IMPROVES LANGUAGE MODELS FOR ASR |
4909 | LASAFT: LATENT SOURCE ATTENTIVE FREQUENCY TRANSFORMATION FOR CONDITIONED SOURCE SEPARATION |
4307 | LATENT SPACE MOTION ANALYSIS FOR COLLABORATIVE INTELLIGENCE |
4986 | LATTICE-FREE MMI ADAPTATION OF SELF-SUPERVISED PRETRAINED ACOUSTIC MODELS |
3467 | LAYER-WISE INTERPRETATION OF DEEP NEURAL NETWORKS USING IDENTITY INITIALIZATION |
4850 | Leaky Integrator Dynamical Systems and Reachable Sets |
3537 | LEARNED DECIMATION FOR NEURAL BELIEF PROPAGATION DECODERS |
1347 | LEARNED TRANSFERABLE ARCHITECTURES CAN SURPASS HAND-DESIGNED ARCHITECTURES FOR LARGE SCALE SPEECH RECOGNITION |
2019 | Learning a Sparse Generative Non-Parametric Supervised Autoencoder |
1823 | LEARNING A TREE OF NEURAL NETS |
2229 | Learning Audio Embeddings with User Listening Data for Content-based Music Recommendation |
1828 | LEARNING AUDIO-VISUAL CORRELATIONS FROM VARIATIONAL CROSS-MODAL GENERATION |
1366 | LEARNING BINARY SEMANTIC EMBEDDING FOR BREAST HISTOLOGY IMAGE CLASSIFICATION AND RETRIEVAL |
2409 | LEARNING BOLLOBÁS-RIORDAN GRAPHS UNDER PARTIAL OBSERVABILITY |
4055 | Learning Contextual Tag Embeddings for Cross-Modal Alignment of Audio and Tags |
2330 | LEARNING DISCRIMINATIVE FEATURES FOR SEMI-SUPERVISED ANOMALY DETECTION |
3115 | LEARNING DISENTANGLED FEATURE REPRESENTATIONS FOR SPEECH ENHANCEMENT VIA ADVERSARIAL TRAINING |
2792 | LEARNING DISENTANGLED PHONE AND SPEAKER REPRESENTATIONS IN A SEMI-SUPERVISED VQ-VAE PARADIGM |
3523 | Learning double-compression video fingerprints left from social media platforms |
2590 | LEARNING FROM HETEROGENEOUS EEG SIGNALS WITH DIFFERENTIABLE CHANNEL REORDERING |
3212 | LEARNING INTEGRODIFFERENTIAL MODELS FOR IMAGE DENOISING |
3972 | LEARNING MIXED MEMBERSHIP FROM ADJACENCY GRAPH VIA SYSTEMATIC EDGE QUERY: IDENTIFIABILITY AND ALGORITHM |
5618 | LEARNING MIXTURES OF SEPARABLE DICTIONARIES FOR TENSOR DATA: ANALYSIS AND ALGORITHMS |
3061 | LEARNING MODEL-BLIND TEMPORAL DENOISERS WITHOUT GROUND TRUTHS |
5204 | LEARNING ON HETEROGENEOUS GRAPHS USING HIGH-ORDER RELATIONS |
5117 | LEARNING OPTIMAL LATTICE CODES FOR MIMO COMMUNICATIONS |
4208 | LEARNING POSE-ADAPTIVE LIP SYNC WITH CASCADED TEMPORAL CONVOLUTIONAL NETWORK |
2466 | LEARNING REPRESENTATION OF MULTI-SCALE OBJECT FOR FINE-GRAINED IMAGE RETRIEVAL |
2765 | LEARNING SEPARABLE TIME-FREQUENCY FILTERBANKS FOR AUDIO CLASSIFICATION |
3960 | LEARNING SPARSE GRAPH LAPLACIAN WITH K EIGENVECTOR PRIOR VIA ITERATIVE GLASSO AND PROJECTION |
1704 | LEARNING SPARSIFYING TRANSFORMS FOR IMAGE RECONSTRUCTION IN ELECTRICAL IMPEDANCE TOMOGRAPHY |
4251 | LEARNING THE RELEVANT SUBSTRUCTURES FOR TASKS ON GRAPH DATA |
2141 | Learning to Continuously Optimize Wireless Resource In Episodically Dynamic Environment |
2879 | LEARNING TO ESTIMATE KERNEL SCALE AND ORIENTATION OF DEFOCUS BLUR WITH ASYMMETRIC CODED APERTURE |
2518 | LEARNING TO SELECT CONTEXT IN A HIERARCHICAL AND GLOBAL PERSPECTIVE FOR OPEN-DOMAIN DIALOGUE GENERATION |
4289 | LEARNING TO SELECT FOR MIMO RADAR BASED ON HYBRID ANALOG-DIGITAL BEAMFORMING |
3993 | LEARNING WORD-LEVEL CONFIDENCE FOR SUBWORD END-TO-END ASR |
3767 | LEARNING-BASED LOSSLESS COMPRESSION OF 3D POINT CLOUD GEOMETRY |
2519 | LENGTH NO LONGER MATTERS: A REAL LENGTH ADAPTIVE ARRHYTHMIA CLASSIFICATION MODEL WITH MULTI-SCALE CONVOLUTION |
3008 | LESS IS MORE: IMPROVED RNN-T DECODING USING LIMITED LABEL CONTEXT AND PATH MERGING |
4145 | LEVERAGING A MULTIPLE-STRAIN MODEL WITH MUTATIONS IN ANALYZING THE SPREAD OF COVID-19 |
3454 | LEVERAGING ACOUSTIC AND LINGUISTIC EMBEDDINGS FROM PRETRAINED SPEECH AND LANGUAGE MODELS FOR INTENT CLASSIFICATION |
1629 | LEVERAGING THE STRUCTURE OF MUSICAL PREFERENCE IN CONTENT-AWARE MUSIC RECOMMENDATION |
1833 | LIFI: TOWARDS LINGUISTICALLY INFORMED FRAME INTERPOLATION |
4074 | LIGHT FIELD STYLE TRANSFER WITH LOCAL ANGULAR CONSISTENCY |
4364 | LIGHTSPEECH: LIGHTWEIGHT AND FAST TEXT TO SPEECH WITH NEURAL ARCHITECTURE SEARCH |
5209 | LIGHT-TTS: LIGHTWEIGHT MULTI-SPEAKER MULTI-LINGUAL TEXT-TO-SPEECH |
1300 | LIGHTWEIGHT AND ACCURATE SINGLE IMAGE SUPER-RESOLUTION WITH CHANNEL SEGREGATION NETWORK |
1644 | LIGHTWEIGHT AND INTERPRETABLE NEURAL MODELING OF AN AUDIO DISTORTION EFFECT USING HYPERCONDITIONED DIFFERENTIABLE BIQUADS |
5083 | Lightweight Dual-task Networks for Crowd Counting in Aerial Images |
1429 | LIGHTWEIGHT HUMAN POSE ESTIMATION UNDER RESOURCE-LIMITED SCENES |
2834 | LIGHTWEIGHT NON-LOCAL NETWORK FOR IMAGE SUPER-RESOLUTION |
1596 | LINEAR COMPUTATION CODING |
2381 | LINEAR MULTICHANNEL BLIND SOURCE SEPARATION BASED ON TIME-FREQUENCY MASK OBTAINED BY HARMONIC/PERCUSSIVE SOUND SEPARATION |
4782 | LITESING: TOWARDS FAST, LIGHTWEIGHT AND EXPRESSIVE SINGING VOICE SYNTHESIS |
4164 | LOCALLY OPTIMAL DETECTION OF STOCHASTIC TARGETED UNIVERSAL ADVERSARIAL PERTURBATIONS |
3037 | LONG-SHORT TEMPORAL MODELING FOR EFFICIENT ACTION RECOGNITION |
3700 | LOOKING THROUGH WALLS: INFERRING SCENES FROM VIDEO-SURVEILLANCE ENCRYPTED TRAFFIC |
3603 | LOOPNET: MUSICAL LOOP SYNTHESIS CONDITIONED ON INTUITIVE MUSICAL PARAMETERS |
5137 | LOW COMPLEXITY SECURE P-TENSOR PRODUCT COMPRESSED SENSING RECONSTRUCTION OUTSOURCING AND IDENTITY AUTHENTICATION IN CLOUD |
3048 | Low Complexity SLM for OFDMA System with Implicit Side Information |
1586 | LOW LATENCY ONLINE BLIND SOURCE SEPARATION BASED ON JOINT OPTIMIZATION WITH BLIND DEREVERBERATION |
3323 | LOW MUTUAL COUPLING SPARSE ARRAY DESIGN USING ULA FITTING |
3738 | LOW RESOURCE AUDIO-TO-LYRICS ALIGNMENT FROM POLYPHONIC MUSIC RECORDINGS |
5589 | LOW-COMPLEXITY METHODS FOR ESTIMATION AFTER PARAMETER SELECTION |
1768 | LOW-COMPLEXITY PARAMETER LEARNING FOR OTFS MODULATION BASED AUTOMOTIVE RADAR |
3878 | LOW-COMPLEXITY, REAL-TIME JOINT NEURAL ECHO CONTROL AND SPEECH ENHANCEMENT BASED ON PERCEPNET |
4176 | LOW-DIMENSIONAL DENOISING EMBEDDING TRANSFORMER FOR ECG CLASSIFICATION |
4904 | LOW-LATENCY POLAR DECODER USING OVERLAPPED SCL PROCESSING |
5020 | LOW-RANK AND SPARSE DECOMPOSITION FOR JOINT DOA ESTIMATION AND CONTAMINATED SENSORS DETECTION WITH SPARSELY CONTAMINATED ARRAYS |
4406 | Low-rank on Graphs plus Temporally Smooth Sparse Decomposition for Anomaly Detection in Spatiotemporal Data |
3233 | LOW-RESOURCE EXPRESSIVE TEXT-TO-SPEECH USING DATA AUGMENTATION |
1082 | L-RED: EFFICIENT POST-TRAINING DETECTION OF IMPERCEPTIBLE BACKDOOR ATTACKS WITHOUT ACCESS TO THE TRAINING SET |
3524 | LSSED: A LARGE-SCALE DATASET AND BENCHMARK FOR SPEECH EMOTION RECOGNITION |
1918 | LTAF-NET: LEARNING TASK-AWARE ADAPTIVE FEATURES AND REFINING MASK FOR FEW-SHOT SEMANTIC SEGMENTATION |
3858 | LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation |
4507 | MACHINE TRANSLATION VERBOSITY CONTROL FOR AUTOMATIC DUBBING |
1278 | m-Activity: ACCURATE AND REAL-TIME HUMAN ACTIVITY RECOGNITION VIA MILLIMETER WAVE RADAR |
1919 | MAEC: Multi-instance learning with an Adversarial Auto-encoder-based Classifier for Speech Emotion Recognition |
4265 | MAKF-SR: Multi-Agent Adaptive Kalman Filtering-based Successor Representations |
3083 | MAKING PUNCTUATION RESTORATION ROBUST AND FAST WITH MULTI-TASK LEARNING AND KNOWLEDGE DISTILLATION |
4578 | MAPGN: MAsked Pointer-Generator Network for Sequence-to-Sequence Pre-training |
2931 | MARBLENET: DEEP 1D TIME-CHANNEL SEPARABLE CONVOLUTIONAL NEURAL NETWORK FOR VOICE ACTIVITY DETECTION |
5601 | Mask Combination of Multi-Layer Graphs for Global Structure Inference |
1208 | MASK4D: 4D CONVOLUTION NETWORK FOR LIGHT FIELD OCCLUSION REMOVAL |
5310 | MASKCYCLEGAN-VC: LEARNING NON-PARALLEL VOICE CONVERSION WITH FILLING IN FRAMES |
5225 | MATCHING AS COLOR IMAGES: THERMAL IMAGE LOCAL FEATURE DETECTION AND DESCRIPTION |
5297 | MAXIMUM A POSTERIORI ESTIMATOR FOR CONVOLUTIVE SOUND SOURCE SEPARATION WITH SUB-SOURCE BASED NTF MODEL AND THE LOCALIZATION PROBABILISTIC PRIOR ON THE MIXING MATRIX |
5607 | MAXIMUM ENTROPY-BASED INTERFERENCE-PLUS-NOISE COVARIANCE MATRIX RECONSTRUCTION FOR ROBUST ADAPTIVE BEAMFORMING |
3339 | MBNET: MOS PREDICTION FOR SYNTHESIZED SPEECH WITH MEAN-BIAS NETWORK |
1149 | MCR-Net: A Multi-Step Co-Interactive Relation Network for Unanswerable Questions on Machine Reading Comprehension |
1910 | MEASUREMENT CODING FRAMEWORK WITH ADJACENT PIXELS BASED MEASUREMENT MATRIX FOR COMPRESSIVELY SENSED IMAGES |
1386 | MEASURE-TRANSFORMED COVARIANCE TEST FOR ROBUST SPECTRUM SENSING |
5582 | Measure-Transformed MVDR Beamforming |
3015 | MELODY HARMONIZATION USING ORDERLESS NADE, CHORD BALANCING, AND BLOCKED GIBBS SAMPLING |
1463 | MELON PLAYLIST DATASET: A PUBLIC DATASET FOR AUDIO-BASED PLAYLIST GENERATION AND MUSIC TAGGING |
3520 | MEMORY LAYERS WITH MULTI-HEAD ATTENTION MECHANISMS FOR TEXT-DEPENDENT SPEAKER VERIFICATION |
3830 | MEMORY-EFFICIENT SPEECH RECOGNITION ON SMART DEVICES |
3945 | MESSAGE TRANSMISSION OVER RAPIDLY TIME-VARYING CHANNELS |
2228 | Meta Ordinal Weighting Net For Improving Lung Nodule Classification |
5344 | META-ADAPTER: EFFICIENT CROSS-LINGUAL ADAPTATION WITH META-LEARNING |
5367 | Meta-cognition-based Simple and Effective Approach to Object Detection |
5124 | Meta-Learning for 6G Communication Networks with Reconfigurable Intelligent Surfaces |
3622 | META-LEARNING FOR CROSS-CHANNEL SPEAKER VERIFICATION |
4780 | META-LEARNING FOR IMPROVING RARE WORD RECOGNITION IN END-TO-END ASR |
5196 | META-LEARNING FOR LOW-RESOURCE SPEECH EMOTION RECOGNITION |
3988 | META-LEARNING WITH ATTENTION FOR IMPROVED FEW-SHOT LEARNING |
3356 | MICAUGMENT: ONE-SHOT MICROPHONE STYLE TRANSFER |
4220 | MICROSOFT SPEAKER DIARIZATION SYSTEM FOR THE VOXCELEB SPEAKER RECOGNITION CHALLENGE 2020 |
2681 | MILLIMETER WAVE MIMO CHANNEL ESTIMATION WITH 1-BIT SPATIAL SIGMA-DELTA ANALOG-TO-DIGITAL CONVERTERS |
4329 | MIND THE BEAT: DETECTING AUDIO ONSETS FROM EEG RECORDINGS OF MUSIC LISTENING |
2226 | Minimizing Weighted Concave Impurity Partition Under Constraints |
1553 | MINIMUM BAYES RISK TRAINING FOR END-TO-END SPEAKER-ATTRIBUTED ASR |
3085 | MISALIGNMENT RECOGNITION IN ACOUSTIC SENSOR NETWORKS USING A SEMI-SUPERVISED SOURCE ESTIMATION METHOD AND MARKOV RANDOM FIELDS |
1141 | MISPRONUNCIATION DETECTION IN NON-NATIVE (L2) ENGLISH WITH UNCERTAINTY MODELING |
2799 | MITIGATING CLIPPING DISTORTION IN OFDM USING DEEP RESIDUAL LEARNING |
3556 | MITIGATING INTER-SUBJECT BRAIN SIGNAL VARIABILITY FOR EEG-BASED DRIVER FATIGUE STATE CLASSIFICATION |
5603 | MIXED MONOTONIC PROGRAMMING FOR FAST GLOBAL OPTIMIZATION |
3457 | MIXED PRECISION QUANTIZATION OF TRANSFORMER LANGUAGE MODELS FOR SPEECH RECOGNITION |
2230 | MIXSPEECH: DATA AUGMENTATION FOR LOW-RESOURCE AUTOMATIC SPEECH RECOGNITION |
2755 | MIXTURE OF INFORMED EXPERTS FOR MULTILINGUAL SPEECH RECOGNITION |
1788 | Mixup Regularized Adversarial Networks for Multi-Domain Text Classification |
1981 | MODELING HOMOPHONE NOISE FOR ROBUST NEURAL MACHINE TRANSLATION |
1365 | MODEL-INSPIRED DEEP LEARNING FOR LIGHT-FIELD MICROSCOPY WITH APPLICATION TO NEURON LOCALIZATION |
4313 | Modelling Paralinguistic Properties in Conversational Speech to Detect Bipolar Disorder and Borderline Personality Disorder |
1440 | MODIFIED ARCSINE LAW FOR ONE-BIT SAMPLED STATIONARY SIGNALS WITH TIME-VARYING THRESHOLDS |
1237 | Modular Binary Tree Architecture for Distributed Large Intelligent Surface |
1362 | MODUREC: RECOMMENDER SYSTEMS WITH FEATURE AND TIME MODULATION |
2248 | MONAURAL SPEECH ENHANCEMENT WITH COMPLEX CONVOLUTIONAL BLOCK ATTENTION MODULE AND JOINT TIME FREQUENCY LOSSES |
3497 | MORE: A METRIC LEARNING BASED FRAMEWORK FOR OPEN-DOMAIN RELATION EXTRACTION |
3893 | MOVEMENT DETECTION USING A RECIPROCAL RECEIVED SIGNAL STRENGTH MODEL |
3097 | MOVING OBJECT CLASSIFICATION WITH A SUB-6 GHZ MASSIVE MIMO ARRAY USING REAL DATA |
1249 | MPDNet: A 3D Missing Part Detection Network Based on Point Cloud Segmentation |
3966 | MRI IMAGE RECOVERY USING DAMPED DENOISING VECTOR AMP |
4422 | MS-CSPN: MULTI-SCALE CASCADE SPATIAL PYRAMID NETWORK FOR OBJECT DETECTION |
4466 | MSR-GAN: Multi-Segment Reconstruction via Adversarial Learning |
3599 | MUG: A MULTIPATH-EXPLOITED AND GRID-FREE LOCALISATION METHOD |
1641 | MULTI PATH TRAINING FRAMEWORK FOR DATA-DRIVEN OPEN-DOMAIN CONVERSATION SYSTEM |
2699 | MULTI-BRANCH TOMLINSON-HARASHIMA PRECODING FOR RATE SPLITTING BASED SYSTEMS WITH MULTIPLE ANTENNAS |
1367 | MULTICHANNEL OVERLAPPING SPEAKER SEGMENTATION USING MULTIPLE HYPOTHESIS TRACKING OF ACOUSTIC AND SPATIAL FEATURES |
4287 | Multi-Channel Speech Enhancement using Graph Neural Networks |
1448 | MULTI-CHANNEL TARGET SPEECH EXTRACTION WITH CHANNEL DECORRELATION AND TARGET SPEAKER ADAPTATION |
5156 | Multichannel-based learning for audio object extraction |
4510 | MULTI-DECODER DPRNN: SOURCE SEPARATION FOR VARIABLE NUMBER OF SPEAKERS |
5592 | Multi-Delay Sparse Approach to Residual Crosstalk Reduction for Blind Source Separation |
3736 | MULTI-DIALECT SPEECH RECOGNITION IN ENGLISH USING ATTENTION ON ENSEMBLE OF EXPERTS |
5152 | MULTI-DIRECTIONAL CONVOLUTION NETWORKS WITH SPATIAL-TEMPORAL FEATURE PYRAMID MODULE FOR ACTION RECOGNITION |
2560 | Multi-Entity Collaborative Relation Extraction |
1984 | MULTI-GRANULARITY FEATURE INTERACTION AND RELATION REASONING FOR 3D DENSE ALIGNMENT AND FACE RECONSTRUCTION |
3041 | MULTI-GRANULARITY HETEROGENEOUS GRAPH FOR DOCUMENT-LEVEL RELATION EXTRACTION |
4749 | MULTI-INITIALIZATION META-LEARNING WITH DOMAIN ADAPTATION |
1543 | MULTILABEL 12-LEAD ELECTROCARDIOGRAM CLASSIFICATION USING BEAT TO SEQUENCE AUTOENCODERS |
5145 | MULTI-LEVEL ADAPTIVE REGION OF INTEREST AND GRAPH LEARNING FOR FACIAL ACTION UNIT RECOGNITION |
1235 | Multi-Level Group Testing with Application to One-Shot Pooled COVID-19 Tests |
3805 | MULTI-LEVEL REVERSIBLE ENCRYPTION FOR ECG SIGNALS USING COMPRESSIVE SENSING |
4237 | MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION |
5358 | MULTIMODAL CROSS- AND SELF-ATTENTION NETWORK FOR SPEECH EMOTION RECOGNITION |
3472 | MULTIMODAL EMOTION RECOGNITION WITH CAPSULE GRAPH CONVOLUTIONAL BASED REPRESENTATION FUSION |
5159 | MULTI-MODAL LABEL DEQUANTIZED GAUSSIAN PROCESS LATENT VARIABLE MODEL FOR ORDINAL LABEL ESTIMATION |
3334 | MULTIMODAL METRIC LEARNING FOR TAG-BASED MUSIC RETRIEVAL |
4088 | MULTIMODAL PUNCTUATION PREDICTION WITH CONTEXTUAL DROPOUT |
3284 | MULTI-MODELS FUSION FOR LIGHT FIELD ANGULAR SUPER-RESOLUTION |
2837 | MULTI-OBJECT TRACKING USING POISSON MULTI-BERNOULLI MIXTURE FILTERING FOR AUTONOMOUS VEHICLES |
4538 | MULTI-ORDER ADVERSARIAL REPRESENTATION LEARNING FOR COMPOSED QUERY IMAGE RETRIEVAL |
1865 | MULTIPHISH: MULTI-MODAL FEATURES FUSION NETWORKS FOR PHISHING DETECTION |
1817 | Multiple Auxiliary Networks for Single Blind Image Deblurring |
4771 | MULTIPLE HUMAN TRACKING IN NON-SPECIFIC COVERAGE WITH WEARABLE CAMERAS |
2742 | MULTIPLE-HYPOTHESIS CTC-BASED SEMI-SUPERVISED ADAPTATION OF END-TO-END SPEECH RECOGNITION |
1299 | MULTIPLE-INPUT MULTIPLE-OUTPUT FUSION NETWORK FOR GENERALIZED ZERO-SHOT LEARNING |
4240 | MULTI-RATE ATTENTION ARCHITECTURE FOR FAST STREAMABLE TEXT-TO-SPEECH SPECTRUM MODELING |
1198 | MULTI-SAMPLE ONLINE LEARNING FOR SPIKING NEURAL NETWORKS BASED ON GENERALIZED EXPECTATION MAXIMIZATION |
5368 | MULTI-SCALE AND MULTI-REGION FACIAL DISCRIMINATIVE REPRESENTATION FOR AUTOMATIC DEPRESSION LEVEL DETECTION |
2061 | MULTI-SCALE CASCADE DISPARITY REFINEMENT STEREO NETWORK |
4727 | MULTI-SCALE FEATURE-GUIDED STEREOSCOPIC VIDEO QUALITY ASSESSMENT BASED ON 3D CONVOLUTIONAL NEURAL NETWORK |
3219 | MULTI-SCALE SAMPLE SELECTION BASED ON STATISTICAL CHARACTERISTICS FOR OBJECT DETECTION |
4225 | MULTI-SCALE SPEAKER DIARIZATION WITH NEURAL AFFINITY SCORE FUSION |
3450 | MULTI-SPEAKER EMOTIONAL SPEECH SYNTHESIS WITH FINE-GRAINED PROSODY MODELING |
3440 | Multi-stage Speaker Extraction with Utterance and Frame-Level Reference Signals |
4065 | Multi-Step Spoken Language Understanding System based on Adversarial Learning |
5555 | MULTISTREAM CNN FOR ROBUST ACOUSTIC MODELING |
1809 | Multi-target DoA estimation with an audio-visual fusion mechanism |
3699 | Multi-task Estimation of Age and Cognitive Decline from Speech |
3075 | MULTITASK LEARNING AND JOINT OPTIMIZATION FOR TRANSFORMER-RNN-TRANSDUCER SPEECH RECOGNITION |
3318 | MULTI-TASK LEARNING VIA SHARING INEXACT LOW-RANK SUBSPACE |
2075 | MULTI-TASK SELF-SUPERVISED PRE-TRAINING FOR MUSIC CLASSIFICATION |
1747 | MULTI-TASK TRANSFORMER WITH INPUT FEATURE RECONSTRUCTION FOR DYSARTHRIC SPEECH RECOGNITION |
5611 | Multi-Task WaveRNN with an Integrated Architecture for Cross-lingual Voice Conversion |
3804 | MULTI-TIER FEDERATED LEARNING FOR VERTICALLY PARTITIONED DATA |
4087 | MULTIVARIATE NON-NEGATIVE MATRIX FACTORIZATION WITH APPLICATION TO ENERGY DISAGGREGATION |
1210 | Multi-Vehicle Velocity Estimation Using IEEE 802.11ad Waveform |
4105 | MULTI-VIEW AUDIO AND MUSIC CLASSIFICATION |
1095 | MULTI-VIEW CONTRASTIVE LEARNING FOR ONLINE KNOWLEDGE DISTILLATION |
2631 | MULTIVIEW SENSING WITH UNKNOWN PERMUTATIONS: AN OPTIMAL TRANSPORT APPROACH |
4039 | MULTIVIEW VARIATIONAL GRAPH AUTOENCODERS FOR CANONICAL CORRELATION ANALYSIS |
2238 | Muse: Multi-modal target speaker extraction with visual cues |
3304 | MUTUAL INFORMATION FLOWS IN A BIVARIATE POINT PROCESS |
5446 | MUTUALLY-CONSTRAINED MONOTONIC MULTIHEAD ATTENTION FOR ONLINE ASR |
2973 | NASA: A Noise-Adaptive and Structure-Aware Learning Framework for Image Deblurring |
1133 | NEAR-OPTIMAL ALGORITHMS FOR PIECEWISE-STATIONARY CASCADING BANDITS |
3742 | NEAR-OPTIMAL RESAMPLING IN PARTICLE FILTERS USING THE ISING ENERGY MODEL |
2494 | NESTED ERROR MAP GENERATION NETWORK FOR NO-REFERENCE IMAGE QUALITY ASSESSMENT |
1462 | NESTED LEARNING FOR MULTI-LEVEL CLASSIFICATION |
4589 | NETWORK AND CONTENT-DEPENDENT BITRATE LADDER ESTIMATION FOR ADAPTIVE BITRATE VIDEO STREAMING |
2407 | NETWORK CLASSIFIERS BASED ON SOCIAL LEARNING |
5605 | NETWORK INFERENCE FROM CONSENSUS DYNAMICS WITH UNKNOWN PARAMETERS |
3133 | NETWORK PRUNING USING LINEAR DEPENDENCY ANALYSIS ON FEATURE MAPS |
3897 | NETWORK TOPOLOGY CHANGE-POINT DETECTION FROM GRAPH SIGNALS WITH PRIOR SPECTRAL SIGNATURES |
1548 | NETWORK TOPOLOGY INFERENCE WITH GRAPHON SPECTRAL PENALTIES |
3607 | NETWORK-AWARE OPTIMAL MICROPHONE CHANNEL SELECTION IN WIRELESS ACOUSTIC SENSOR NETWORKS |
3056 | Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks |
4978 | NEURAL AUDIO FINGERPRINT FOR HIGH-SPECIFIC AUDIO RETRIEVAL BASED ON CONTRASTIVE LEARNING |
4370 | NEURAL INVERSE TEXT NORMALIZATION |
3262 | Neural Kalman Filtering for Speech Enhancement |
3305 | NEURAL LAYERED MIN-SUM DECODING FOR PROTOGRAPH LDPC CODES |
2453 | NEURAL NETWORK-BASED VIRTUAL MICROPHONE ESTIMATOR |
3668 | NEURAL NOISE EMBEDDING FOR END-TO-END SPEECH ENHANCEMENT WITH CONDITIONAL LAYER NORMALIZATION |
5400 | NEURAL UTTERANCE CONFIDENCE MEASURE FOR RNN-TRANSDUCERS AND TWO PASS MODELS |
1820 | NEURO-STEERED MUSIC SOURCE SEPARATION WITH EEG-BASED AUDITORY ATTENTION DECODING AND CONTRASTIVE-NMF |
3561 | NEW VARIANTS OF DFA BASED ON LOESS AND LOWESS METHODS: GENERALIZATION OF THE DETRENDING MOVING AVERAGE |
3417 | NISP: A MULTI-LINGUAL MULTI-ACCENT DATASET FOR SPEAKER PROFILING |
4893 | NLKD: using coarse annotations for semantic segmentation based on knowledge distillation |
1627 | NMF-SAE: AN INTERPRETABLE SPARSE AUTOENCODER FOR HYPERSPECTRAL UNMIXING |
5045 | NNAKF: A NEURAL NETWORK ADAPTED KALMAN FILTER FOR TARGET TRACKING |
5348 | NN-KOG2P: A NOVEL GRAPHEME-TO-PHONEME MODEL FOR KOREAN LANGUAGE |
4758 | NO RELAXATION: GUARANTEED RECOVERY OF FINITE-VALUED SIGNALS FROM UNDERSAMPLED MEASUREMENTS |
4911 | NODE ATTRIBUTE COMPLETION IN KNOWLEDGE GRAPHS WITH MULTI-RELATIONAL PROPAGATION |
2791 | NOISE LEVEL LIMITED SUB-MODELING FOR DIFFUSION PROBABILISTIC VOCODERS |
5091 | NOISE-ASSISTED MULTIVARIATE VARIATIONAL MODE DECOMPOSITION |
3500 | NOISE-ROBUST ADAPTATION CONTROL FOR SUPERVISED ACOUSTIC SYSTEM IDENTIFICATION EXPLOITING A NOISE DICTIONARY |
1744 | NON-AUTOREGRESSIVE SEQUENCE-TO-SEQUENCE VOICE CONVERSION |
1599 | NON-AUTOREGRESSIVE TRANSFORMER ASR WITH CTC-ENHANCED DECODER INPUT |
3886 | NON-COHERENT DOA ESTIMATION OF OFF-GRID SIGNALS WITH UNIFORM CIRCULAR ARRAYS |
1879 | Noncontact Heartbeat Detection by Viterbi Algorithm with Fusion of Beat-Beat Interval and Deep Learning-driven Branch Metrics |
2477 | NON-CONVEX SPARSE DEVIATION MODELING VIA GENERATIVE MODELS |
3541 | NON-INTRUSIVE BINAURAL PREDICTION OF SPEECH INTELLIGIBILITY BASED ON PHONEME CLASSIFICATION |
1471 | NON-ITERATIVE BLIND CALIBRATION OF NESTED ARRAYS WITH ASYMPTOTICALLY OPTIMAL WEIGHTING |
5227 | NONLINEAR STATE-SPACE GENERALIZATIONS OF GRAPH CONVOLUTIONAL NEURAL NETWORKS |
1018 | NON-LOCAL SINGLE IMAGE DE-RAINING WITHOUT DECOMPOSITION |
1234 | Nonnegative Unimodal Matrix Factorization |
5305 | NON-PARALLEL MANY-TO-MANY VOICE CONVERSION BY KNOWLEDGE TRANSFER FROM A TEXT-TO-SPEECH MODEL |
1111 | NON-PARALLEL MANY-TO-MANY VOICE CONVERSION USING LOCAL LINGUISTIC TOKENS |
4844 | NON-RECURSIVE GRAPH CONVOLUTIONAL NETWORKS |
4108 | NON-SINGULAR ADVERSARIAL ROBUSTNESS OF NEURAL NETWORKS |
2373 | NONSTATIONARY PORTFOLIOS: DIVERSIFICATION IN THE SPECTRAL DOMAIN |
4755 | NO-REFERENCE STEREOSCOPIC IMAGE QUALITY ASSESSMENT BASED ON THE HUMAN VISUAL SYSTEM |
5613 | NOVEL ARCHITECTURES FOR UNSUPERVISED INFORMATION BOTTLENECK BASED SPEAKER DIARIZATION OF MEETINGS |
1165 | NUMERICAL SOLUTION OF STOCHASTIC DIFFERENTIAL EQUATIONS IN STIEFEL MANIFOLDS VIA TANGENT SPACE PARAMETRIZATION |
1070 | OAS-NET: OCCLUSION AWARE SAMPLING NETWORK FOR ACCURATE OPTICAL FLOW |
3124 | Object-Oriented Relational Distillation for Object Detection |
4206 | On a Guided Nonnegative Matrix Factorization |
2955 | ON DISTRIBUTED COMPOSITE TESTS WITH DEPENDENT OBSERVATIONS IN WSN |
3290 | ON INFORMATION ASYMMETRY IN ONLINE REINFORCEMENT LEARNING |
4419 | ON LOSS FUNCTIONS FOR DEEP-LEARNING BASED T60 ESTIMATION |
2213 | ON OVERFITTING IN DISCRETE SUPER-RESOLUTION RECOVERY |
3140 | ON PERMUTATION INVARIANT TRAINING FOR SPEECH SOURCE SEPARATION |
5252 | ON SCALING CONTRASTIVE REPRESENTATIONS FOR LOW-RESOURCE SPEECH RECOGNITION |
3774 | On Strategic Jamming in Distributed Detection Networks |
1607 | ON THE ACCURACY LIMIT OF JOINT TIME-DELAY/DOPPLER/ACCELERATION ESTIMATION WITH A BAND-LIMITED SIGNAL |
1076 | ON THE ADVERSARIAL ROBUSTNESS OF PRINCIPAL COMPONENT ANALYSIS |
5427 | On the Asymptotic Performance of One-Bit Co-Array-Based MUSIC |
1167 | ON THE CAMERA POSITION DITHERING IN VISUAL 3D RECONSTRUCTION |
5538 | ON THE CONVERGENCE OF RANDOMIZED BREGMAN COORDINATE DESCENT FOR NON-LIPSCHITZ COMPOSITE PROBLEMS |
1290 | ON THE DESIGN OF SQUARE DIFFERENTIAL MICROPHONE ARRAYS WITH A MULTISTAGE STRUCTURE |
3954 | ON THE DETECTION OF PITCH-SHIFTED VOICE: MACHINES AND HUMAN LISTENERS |
4257 | ON THE EFFECT OF SPATIAL CORRELATION ON DISTRIBUTED ENERGY DETECTION OF A STOCHASTIC PROCESS |
5599 | On the Identifiability of Transform Learning for Non-Negative Matrix Factorization |
4796 | ON THE MARGINAL BENEFIT OF ACTIVE LEARNING: DOES SELF-SUPERVISION EAT ITS CAKE? |
3760 | On the Optimality of Backward Regression: Sparse Recovery and Subset Selection |
4658 | On the Performance-Complexity Tradeoff in Stochastic Greedy Weak Submodular Optimization |
3854 | ON THE POWER OF DEEP BUT NAIVE PARTIAL LABEL LEARNING |
2876 | ON THE PREDICTABILITY OF HRTFS FROM EAR SHAPES USING DEEP NETWORKS |
3519 | ON THE PREPARATION AND VALIDATION OF A LARGE-SCALE DATASET OF SINGING TRANSCRIPTION |
5187 | ON THE RELATIONSHIP BETWEEN SPEECH-BASED BREATHING SIGNAL PREDICTION EVALUATION MEASURES AND BREATHING PARAMETERS ESTIMATION |
2814 | ON THE ROLE OF VISUAL CUES IN AUDIOVISUAL SPEECH ENHANCEMENT |
3601 | ON THE STABILITY OF GRAPH CONVOLUTIONAL NEURAL NETWORKS UNDER EDGE REWIRING |
4836 | ONE SHOT LEARNING FOR SPEECH SEPARATION |
4311 | ONE-BIT AUTOCORRELATION ESTIMATION WITH NON-ZERO THRESHOLDS |
3983 | ONE-BIT COMPRESSED SENSING USING UNTRAINED NETWORK PRIOR |
3816 | ONE-SHOT CONDITIONAL AUDIO FILTERING OF ARBITRARY SOUNDS |
2217 | ONE-SHOT VOICE CONVERSION BASED ON SPEAKER AWARE MODULE |
5061 | Online Antenna Selection for Enhanced DOA Estimation |
5581 | Online Automatic Speech Recognition With Listen, Attend and Spell Model |
3571 | ONLINE CLASSIFICATION OF DYNAMIC MULTILAYER-NETWORK TIME SERIES IN RIEMANNIAN MANIFOLDS |
5338 | Online Dynamic Window (ODW) Assisted 2-stage LSTM Indoor Localization for Smart Phones |
1654 | ONLINE HYPER-PARAMETER TUNING FOR THE CONTEXTUAL BANDIT |
3838 | ONLINE LEARNING OF TIME-VARYING SIGNALS AND GRAPHS |
3951 | Online Multi-hop Information based Kernel Learning over Graphs |
5598 | ONLINE SPECTROGRAM INVERSION FOR LOW-LATENCY AUDIO SOURCE SEPARATION |
3970 | ONLINE TIME-VARYING TOPOLOGY IDENTIFICATION VIA PREDICTION-CORRECTION ALGORITHMS |
4494 | ONLINE UNSUPERVISED LEARNING USING ENSEMBLE GAUSSIAN PROCESSES WITH RANDOM FEATURES |
3932 | OPTIMAL ATTACKING STRATEGY AGAINST ONLINE REPUTATION SYSTEMS WITH CONSIDERATION OF THE MESSAGE-BASED PERSUASION PHENOMENON |
1848 | Optimal Detection in the Presence of Non-Gaussian Jamming |
3108 | OPTIMAL IMPORTANCE SAMPLING FOR FEDERATED LEARNING |
5278 | OPTIMAL QUESTIONNAIRES FOR SCREENING OF STRATEGIC AGENTS |
2726 | OPTIMAL SELECTION OF MATRIX SHAPE AND DECOMPOSITION SCHEME FOR NEURAL NETWORK COMPRESSION |
1055 | OPTIMAL TOA LOCALIZATION FOR MOVING SENSOR IN ASYMMETRIC NETWORK |
2721 | OPTIMIZE WHAT MATTERS: TRAINING DNN-HMM KEYWORD SPOTTING MODEL USING END METRIC |
3665 | OPTIMIZING COVERAGE AND CAPACITY IN CELLULAR NETWORKS USING MACHINE LEARNING |
4666 | OPTIMIZING SHORT-TIME FOURIER TRANSFORM PARAMETERS VIA GRADIENT DESCENT |
1536 | OPTIMUM FEATURE ORDERING FOR DYNAMIC INSTANCE-WISE JOINT FEATURE SELECTION AND CLASSIFICATION |
1153 | ORDERED RELIABILITY BITS GUESSING RANDOM ADDITIVE NOISE DECODING |
5034 | ORTHOGONALITY AND ZERO DC TRADEOFFS IN BIORTHOGONAL GRAPH FILTERBANKS |
4098 | Orthros: Non-autoregressive End-to-end Speech Translation with Dual-decoder |
2079 | OUTLIER-ROBUST KERNEL HIERARCHICAL-OPTIMIZATION RLS ON A BUDGET WITH AFFINE CONSTRAINTS |
5170 | OVERCOMING MEASUREMENT INCONSISTENCY IN DEEP LEARNING FOR LINEAR INVERSE PROBLEMS: APPLICATIONS IN MEDICAL IMAGING |
5000 | PARAGRAPH LEVEL MULTI-PERSPECTIVE CONTEXT MODELING FOR QUESTION GENERATION |
3827 | PARALLEL ITERATED EXTENDED AND SIGMA-POINT KALMAN SMOOTHERS |
3566 | PARALLEL TACOTRON: NON-AUTOREGRESSIVE AND CONTROLLABLE TTS |
3306 | PARALLEL WAVEFORM SYNTHESIS BASED ON GENERATIVE ADVERSARIAL NETWORKS WITH VOICING-AWARE CONDITIONAL DISCRIMINATORS |
2329 | Parameter Estimation for Coherent Passive MIMO Radar with Unknown Signals under Direct Path Influence |
1616 | PARAMETER ESTIMATION FOR STUDENT'S t VAR MODEL WITH MISSING DATA |
4650 | Parameter Identifiability of Spatial-Smoothing-Based Bistatic MIMO Radar |
4473 | Parametric Spectral Filters for Fast Converging, Scalable Convolutional Neural Networks |
1979 | PART-ALIGNED NETWORK WITH BACKGROUND FOR MISALIGNED PERSON SEARCH |
1341 | PARTIAL FEATURE AGGREGATION NETWORK FOR REAL-TIME OBJECT COUNTING |
1495 | PARTIALLY OVERLAPPED INFERENCE FOR LONG-FORM SPEECH RECOGNITION |
3773 | Particle Gibbs Sampling for Regime-Switching State-Space Models |
4973 | PATCH DECODER-SIDE DEPTH ESTIMATION IN MPEG IMMERSIVE VIDEO |
2984 | PATNET: A PHONEME-LEVEL AUTOREGRESSIVE TRANSFORMER NETWORK FOR SPEECH SYNTHESIS |
4227 | PAUSE-ENCODED LANGUAGE MODELS FOR RECOGNITION OF ALZHEIMER'S DISEASE AND EMOTION |
3484 | PD-GAN: PERCEPTUAL-DETAILS GAN FOR EXTREMELY NOISY LOW LIGHT IMAGE ENHANCEMENT |
3931 | PERCEPTUAL LOSS BASED SPEECH DENOISING WITH AN ENSEMBLE OF AUDIO PATTERN RECOGNITION AND SELF-SUPERVISED MODELS |
4502 | Perceptual Quality Assessment for Recognizing True and Pseudo 4K Content |
4432 | PERFORMANCE ANALYSIS OF SPATIAL AND FREQUENCY DOMAIN INDEX-MODULATED RECONFIGURABLE INTELLIGENT METASURFACES |
1645 | PERIODIC SIGNAL DENOISING: AN ANALYSIS-SYNTHESIS FRAMEWORK BASED ON RAMANUJAN FILTER BANKS AND DICTIONARIES |
5255 | PERIODNET: A NON-AUTOREGRESSIVE WAVEFORM GENERATION MODEL WITH A STRUCTURE SEPARATING PERIODIC AND APERIODIC COMPONENTS |
2784 | PERSONALIZATION STRATEGIES FOR END-TO-END SPEECH RECOGNITION SYSTEMS |
3228 | PERSONALIZED HRTF MODELING USING DNN-AUGMENTED BEM |
2371 | PHASE RECOVERY WITH BREGMAN DIVERGENCES FOR AUDIO SOURCE SEPARATION |
5326 | PHASE TRANSITIONS FOR ONE-VS-ONE AND ONE-VS-ALL LINEAR SEPARABILITY IN MULTICLASS GAUSSIAN MIXTURES |
3888 | PHONE DISTRIBUTION ESTIMATION FOR LOW RESOURCE LANGUAGES |
4200 | Phoneme based Neural Transducer for Large Vocabulary Speech Recognition |
4559 | Phoneme-Based Distribution Regularization for Speech Enhancement |
4030 | PHYSICAL-LAYER SECURITY VIA DISTRIBUTED BEAMFORMING IN THE PRESENCE OF ADVERSARIES WITH UNKNOWN LOCATIONS |
1504 | PIPELINE SAFETY EARLY WARNING METHOD FOR DISTRIBUTED SIGNAL USING BILINEAR CNN AND LIGHTGBM |
5290 | PITCH-TIMBRE DISENTANGLEMENT OF MUSICAL INSTRUMENT SOUNDS BASED ON VAE-BASED METRIC LEARNING |
2395 | PLANAR ARRAY GEOMETRY OPTIMIZATION FOR REGION SOUND ACQUISITION |
3883 | PLAYING A PART: SPEAKER VERIFICATION AT THE MOVIES |
4048 | Plug-And-Play Learned Gaussian-mixture Approximate Message Passing |
3315 | POINT OF CARE IMAGE ANALYSIS FOR COVID19 |
1284 | POINTER NETWORKS FOR ARBITRARY-SHAPED TEXT SPOTTING |
3130 | POLA: ONLINE TIME SERIES PREDICTION BY ADAPTIVE LEARNING RATES |
4528 | POLICY AUGMENTATION: AN EXPLORATION STRATEGY FOR FASTER CONVERGENCE OF DEEP REINFORCEMENT LEARNING ALGORITHMS |
1218 | POLYNOMIAL MATRIX EIGENVALUE DECOMPOSITION OF SPHERICAL HARMONICS FOR SPEECH ENHANCEMENT |
5614 | POPS: POLICY PRUNING AND SHRINKING FOR DEEP REINFORCEMENT LEARNING |
5463 | PORTABLE PHOTOGLOTTOGRAPHY FOR MONITORING VOCAL FOLD VIBRATIONS IN SPEECH PRODUCTION |
4201 | POSITNN: TRAINING DEEP NEURAL NETWORKS WITH MIXED LOW-PRECISION POSIT |
3470 | PPG-BASED SINGING VOICE CONVERSION WITH ADVERSARIAL REPRESENTATION LEARNING |
4390 | Prediction of EGFR Mutation Status in Lung Adenocarcinoma using Multi-source Feature Representations |
2677 | PREDICTION OF OBJECT GEOMETRY FROM ACOUSTIC SCATTERING USING CONVOLUTIONAL NEURAL NETWORKS |
3992 | PREDICTIVE CODING FOR LOSSLESS DATASET COMPRESSION |
3286 | PRE-TRAINING TRANSFORMER DECODER FOR END-TO-END ASR MODEL WITH UNPAIRED TEXT DATA |
3564 | PREVENTING EARLY ENDPOINTING FOR ONLINE AUTOMATIC SPEECH RECOGNITION |
2581 | PRIVACY-ACCURACY TRADE-OFF OF INFERENCE AS SERVICE |
3185 | Privacy-Preserving Cloud-based DNN Inference |
1818 | PRIVACY-PRESERVING NEAR NEIGHBOR SEARCH VIA SPARSE CODING WITH AMBIGUATION |
2400 | PRIVACY-PRESERVING OPTIMAL INSULIN DOSING DECISION |
5392 | PRIVATE WIRELESS FEDERATED LEARNING WITH ANONYMOUS OVER-THE-AIR COMPUTATION |
2187 | PROBABILISTIC GRAPH NEURAL NETWORKS FOR TRAFFIC SIGNAL CONTROL |
2695 | Probability of Resolution of G-MUSIC: An Asymptotic Approach |
4414 | PROBING ACOUSTIC REPRESENTATIONS FOR PHONETIC PROPERTIES |
4007 | PROCESSING PIPELINES FOR EFFICIENT, PHYSICALLY-ACCURATE SIMULATION OF MICROPHONE ARRAY SIGNALS IN DYNAMIC SOUND SCENES |
1766 | Progressive Co-teaching for Ambiguous Speech Emotion Recognition |
2250 | PROGRESSIVE DIALOGUE STATE TRACKING FOR MULTI-DOMAIN DIALOGUE SYSTEMS |
4304 | PROGRESSIVE MULTI-STAGE FEATURE MIX FOR PERSON RE-IDENTIFICATION |
4965 | PROGRESSIVE SPATIO-TEMPORAL GRAPH CONVOLUTIONAL NETWORK FOR SKELETON-BASED HUMAN ACTION RECOGNITION |
3104 | PROGRESSIVE VOICE TRIGGER DETECTION: ACCURACY VS LATENCY |
3464 | PROSODIC CLUSTERING FOR PHONEME-LEVEL PROSODY CONTROL IN END-TO-END SPEECH SYNTHESIS |
3119 | PROSODIC REPRESENTATION LEARNING AND CONTEXTUAL SAMPLING FOR NEURAL TEXT-TO-SPEECH |
4010 | PROTOTYPE-BASED PERSONALIZED PRUNING |
1812 | PROTOTYPICAL NETWORKS FOR DOMAIN ADAPTATION IN ACOUSTIC SCENE CLASSIFICATION |
5098 | PROVABLY FAST ASYNCHRONOUS AND DISTRIBUTED ALGORITHMS FOR PAGERANK CENTRALITY COMPUTATION |
2259 | PRUNING OF CONVOLUTIONAL NEURAL NETWORKS USING ISING ENERGY MODEL |
1667 | PUSHING THE LIMIT OF PHASE OFFSET FOR CONTACTLESS SENSING USING COMMODITY WIFI |
4449 | PUSHING THE LIMIT OF TYPE I CODEBOOK FOR FDD MASSIVE MIMO BEAMFORMING: A CHANNEL COVARIANCE RECONSTRUCTION APPROACH |
1703 | Pyramid U-Net for Retinal Vessel Segmentation |
4932 | QOE-DRIVEN AND TILE-BASED ADAPTIVE STREAMING FOR POINT CLOUDS |
5089 | QUERY-BY-EXAMPLE KEYWORD SPOTTING SYSTEM USING MULTI-HEAD ATTENTION AND SOFTTRIPLE LOSS |
5201 | QUERYD: A VIDEO DATASET WITH HIGH-QUALITY TEXT AND AUDIO NARRATIONS |
4017 | QUICKEST CHANGE DETECTION WITH TIME INCONSISTENT ANTICIPATORY AGENTS IN CYBER-PHYSICAL SYSTEMS |
4603 | Quickest Joint Detection and Classification of Faults In Statistically Periodic Processes |
1262 | RADAR CLUTTER CLASSIFICATION USING EXPECTATION-MAXIMIZATION METHOD |
1522 | RADIO FREQUENCY BASED HEART RATE VARIABILITY MONITORING |
3149 | RANDOM PROJECTION STREAMS FOR (WEIGHTED) NONNEGATIVE MATRIX FACTORIZATION |
2640 | RANGE GUIDED DEPTH REFINEMENT AND UNCERTAINTY-AWARE AGGREGATION FOR VIEW SYNTHESIS |
5317 | RANK-REVEALING BLOCK-TERM DECOMPOSITION FOR TENSOR COMPLETION |
3930 | RATE 1 QUASI ORTHOGONAL UNIVERSAL TRANSMISSION AND COMBINING FOR MIMO SYSTEMS ACHIEVING FULL DIVERSITY |
3643 | Rate-distortion optimized motion estimation for on-the-sphere compression of 360 videos |
3783 | RAW DATA PROCESSING FOR PRACTICAL TIME-OF-FLIGHT SUPER-RESOLUTION |
2703 | Real Image Super-Resolution using Token Based Contextual Attention |
3994 | REAL NUMBER SIGNAL PROCESSING CAN DETECT DENIAL-OF-SERVICE ATTACKS |
2812 | REAL VERSUS FAKE 4K - AUTHENTIC RESOLUTION ASSESSMENT |
1886 | REAL-TIME DENOISING AND DEREVERBERATION WITH TINY RECURRENT U-NET |
2656 | REAL-TIME INTERAURAL TIME DELAY ESTIMATION VIA ONSET DETECTION |
4484 | Real-Time Radio Modulation Classification with an LSTM Auto-Encoder |
3496 | REAL-TIME SPEECH ENHANCEMENT FOR MOBILE COMMUNICATION BASED ON DUAL-CHANNEL COMPLEX SPECTRAL MAPPING |
3186 | Real-time Speech Frequency Bandwidth Extension |
1206 | REAL-TIME SYNCHRONIZATION IN NEURAL NETWORKS FOR MULTIVARIATE TIME SERIES ANOMALY DETECTION |
1271 | Recent Advances in Arabic Syntactic Diacritics Restoration |
3904 | RECENT DEVELOPMENTS ON ESPNET TOOLKIT BOOSTED BY CONFORMER |
3096 | RECOGNITION OF DYNAMIC HAND GESTURE BASED ON MM-WAVE FMCW RADAR MICRO-DOPPLER SIGNATURES |
3451 | RECURRENT PHASE RECONSTRUCTION USING ESTIMATED PHASE DERIVATIVES FROM DEEP NEURAL NETWORKS |
3111 | RECURSIVE INPUT AND STATE ESTIMATION: A GENERAL FRAMEWORK FOR LEARNING FROM TIME SERIES WITH MISSING DATA |
2760 | REDAT: Accent-Invariant Representation for End-to-End ASR by Domain Adversarial Training with Relabeling |
3775 | REDUCED-COMPLEXITY CHANNEL ESTIMATION BY HIERARCHICAL INTERPOLATION EXPLOITING SPARSITY FOR MASSIVE MIMO SYSTEMS WITH UNIFORM RECTANGULAR ARRAY |
2724 | REDUCED-COMPLEXITY MODULAR POLYNOMIAL MULTIPLICATION FOR R-LWE CRYPTOSYSTEMS |
2914 | REDUCING MODAL ERROR PROPAGATION THROUGH CORRECTING MISMATCHED MICROPHONE GAINS USING RAPID |
3196 | REDUCING SPELLING INCONSISTENCIES IN CODE-SWITCHING ASR USING CONTEXTUALIZED CTC LOSS |
1496 | REFINEMENT OF DIRECTION OF ARRIVAL ESTIMATORS BY MAJORIZATION-MINIMIZATION OPTIMIZATION ON THE ARRAY MANIFOLD |
2767 | REFINING AUTOMATIC SPEECH RECOGNITION SYSTEM FOR OLDER ADULTS |
1421 | REFLECTANCE-ORIENTED PROBABILISTIC EQUALIZATION FOR IMAGE ENHANCEMENT |
4416 | REGRESSION OR CLASSIFICATION? NEW METHODS TO EVALUATE NO-REFERENCE PICTURE AND VIDEO QUALITY MODELS |
2672 | REGULARIZED RECOVERY BY MULTI-ORDER PARTIAL HYPERGRAPH TOTAL VARIATION |
5012 | Reinforcement Stacked Learning with Semantic-Associated Attention for Visual Question Answering |
2200 | Relaxed Wasserstein with Applications to GANs |
3209 | RELIABILITY ASSESSMENT OF SINGING VOICE F0-ESTIMATES USING MULTIPLE ALGORITHMS |
3609 | RELYING ON A RATE CONSTRAINT TO REDUCE MOTION ESTIMATION COMPLEXITY |
2154 | REPAC: RELIABLE ESTIMATION OF PHASE-AMPLITUDE COUPLING IN BRAIN NETWORKS |
2446 | REPLACING HUMAN AUDIO WITH SYNTHETIC AUDIO FOR ON-DEVICE UNSPOKEN PUNCTUATION PREDICTION |
2972 | REPLAY AND SYNTHETIC SPEECH DETECTION WITH RES2NET ARCHITECTURE |
4939 | Replay-Attack Detection using Features with Adaptive Spectro-Temporal Resolution |
4883 | REPRESENTATION LEARNING FOR SPEECH RECOGNITION USING FEEDBACK BASED RELEVANCE WEIGHTING |
3346 | REPRESENTATION LEARNING WITH SPECTRO-TEMPORAL-CHANNEL ATTENTION FOR SPEECH EMOTION RECOGNITION |
1633 | REPRESENTATIVE LOCAL FEATURE MINING FOR FEW-SHOT LEARNING |
1204 | Resolution Limits of 20 Questions Search Strategies for Moving Targets |
4832 | RESPIPE: RESILIENT MODEL-DISTRIBUTED DNN TRAINING AT EDGE NETWORKS |
3502 | REST: Robust Learned Shrinkage-Thresholding unrolled network |
2475 | RETHINKING THE SEPARATION LAYERS IN SPEECH SEPARATION NETWORKS |
4908 | REVERB CONVERSION OF MIXED VOCAL TRACKS USING AN END-TO-END CONVOLUTIONAL DEEP NEURAL NETWORK |
3129 | REVERSIBLE DATA HIDING IN JPEG IMAGES FOR PRIVACY PROTECTION |
5228 | REWEIGHTED DYNAMIC GROUP CONVOLUTION |
1960 | RGLN: Robust Residual Graph Learning Networks via Similarity-Preserving Mapping on Graphs |
4326 | Riemannian Geometric Optimization Methods for Joint Design of Transmit Sequence and Receive Filter of MIMO Radar |
1374 | RIEMANNIAN GEOMETRY ON CONNECTIVITY FOR CLINICAL BCI |
1457 | RIEMANNIAN GEOMETRY-BASED DECODING OF THE DIRECTIONAL FOCUS OF AUDITORY ATTENTION USING EEG |
3958 | RIS-AIDED JOINT LOCALIZATION AND SYNCHRONIZATION WITH A SINGLE-ANTENNA MMWAVE RECEIVER |
2636 | RNN TRANSDUCER MODELS FOR SPOKEN LANGUAGE UNDERSTANDING |
4728 | RNN-T BASED OPEN-VOCABULARY KEYWORD SPOTTING IN MANDARIN WITH MULTI-LEVEL DETECTION |
4851 | Robust Binary Loss for Multi-category Classification with Label Noise |
1436 | ROBUST DEEP REINFORCEMENT LEARNING FOR UNDERWATER NAVIGATION WITH UNKNOWN DISTURBANCES |
2688 | ROBUST DEVICE-FREE PROXIMITY DETECTION USING WIFI |
3188 | ROBUST DOMAIN-FREE DOMAIN GENERALIZATION WITH CLASS-AWARE ALIGNMENT |
1784 | Robust estimation of high-order phase dynamics using Variational Bayes inference |
4680 | ROBUST GRAPH AUTOENCODER FOR HYPERSPECTRAL ANOMALY DETECTION |
5036 | ROBUST GRAPH-FILTER IDENTIFICATION WITH GRAPH DENOISING REGULARIZATION |
4194 | ROBUST LATENT REPRESENTATIONS VIA CROSS-MODAL TRANSLATION AND ALIGNMENT |
5349 | ROBUST MAML: PRIORITIZATION TASK BUFFER WITH ADAPTIVE LEARNING PROCESS FOR MODEL-AGNOSTIC META-LEARNING |
2380 | Robust PCA through Maximum Correntropy Power Iterations |
1990 | Robust Recursive Least M-estimate Adaptive Filter for the Identification of Low-Rank Acoustic Systems |
2419 | ROBUST SPATIAL-TEMPORAL CORRELATION MODEL FOR BACKGROUND INITIALIZATION IN SEVERE SCENE |
1506 | ROBUST STEERABLE DIFFERENTIAL BEAMFORMERS WITH NULL CONSTRAINTS FOR CONCENTRIC CIRCULAR MICROPHONE ARRAYS |
2076 | ROBUST STFT DOMAIN MULTI-CHANNEL ACOUSTIC ECHO CANCELLATION WITH ADAPTIVE DECORRELATION OF THE REFERENCE SIGNALS |
1258 | Robust Voice Activity Detection Using A Masked Auditory Encoder Based Convolutional Neural Network |
4353 | ROBUSTNESS AND DIVERSITY SEEKING DATA-FREE KNOWLEDGE DISTILLATION |
4583 | ROLE AWARE MULTI-PARTY DIALOGUE QUESTION ANSWERING |
2573 | Room adaptive conditioning method for sound event classification in reverberant environments |
1473 | ROOM IMPULSE RESPONSE INTERPOLATION FROM A SPARSE SET OF MEASUREMENTS USING A MODAL ARCHITECTURE |
4785 | Rotation Invariance Analysis of Local Convolutional Features in Image Retrieval |
4523 | ROTATION-ROBUST BEAMFORMING BASED ON SOUND FIELD INTERPOLATION WITH REGULARLY CIRCULAR MICROPHONE ARRAY |
2557 | RoutingGAN: Routing Age Progression and Regression with Disentangled Learning |
4590 | Rule-embedded network for audio-visual voice activity detection in live musical video streams |
3350 | SAFE SCREENING FOR SPARSE REGRESSION WITH THE KULLBACK-LEIBLER DIVERGENCE |
1160 | SAGA: SPARSE ADVERSARIAL ATTACK ON EEG-BASED BRAIN COMPUTER INTERFACE |
3144 | SALIENCY-DRIVEN VERSATILE VIDEO CODING FOR NEURAL OBJECT DETECTION |
5132 | SAMPLE EFFICIENT SUBSPACE-BASED REPRESENTATIONS FOR NONLINEAR META-LEARNING |
1336 | SANDGLASSET: A LIGHT MULTI-GRANULARITY SELF-ATTENTIVE NETWORK FOR TIME-DOMAIN SPEECH SEPARATION |
1733 | SA-NET: SHUFFLE ATTENTION FOR DEEP CONVOLUTIONAL NEURAL NETWORKS |
3845 | SANET++: ENHANCED SCALE AGGREGATION WITH DENSELY CONNECTED FEATURE FUSION FOR CROWD COUNTING |
2773 | SAPAUGMENT: LEARNING A SAMPLE ADAPTIVE POLICY FOR DATA AUGMENTATION |
2575 | SAR IMAGE AUTOFOCUSING USING WIRTINGER CALCULUS AND CAUCHY REGULARIZATION |
4028 | SCALABLE AND DISTRIBUTED MMSE ALGORITHMS FOR UPLINK RECEIVE COMBINING IN CELL-FREE MASSIVE MIMO SYSTEMS |
2240 | SCALABLE DISCRIMINATIVE DISCRETE HASHING FOR LARGE-SCALE CROSS-MODAL RETRIEVAL |
1247 | SCALABLE MULTILEVEL QUANTIZATION FOR DISTRIBUTED DETECTION |
3755 | SCALABLE PRIVACY-PRESERVING DISTRIBUTED EXTREMELY RANDOMIZED TREES FOR STRUCTURED DATA WITH MULTIPLE COLLUDING PARTIES |
1223 | SCALABLE REINFORCEMENT LEARNING FOR ROUTING IN AD-HOC NETWORKS BASED ON PHYSICAL-LAYER ATTRIBUTES |
1119 | SCALED FAST NESTED KEY EQUATION SOLVER FOR GENERALIZED INTEGRATED INTERLEAVED BCH DECODERS |
2152 | Scene Completeness-Aware Lidar Depth Completion for Driving Scenario |
3733 | SCORE-BASED CHANGE DETECTION FOR GRADIENT-BASED LEARNING MACHINES |
1433 | SEARCHING FOR ANOMALIES WITH MULTIPLE PLAYS UNDER DELAY AND SWITCHING COSTS |
3136 | SECRET KEY GENERATION OVER WIRELESS CHANNELS USING SHORT BLOCKLENGTH MULTILEVEL SOURCE POLAR CODING |
3033 | SECURE UAV COMMUNICATIONS UNDER UNCERTAIN EAVESDROPPERS LOCATIONS |
4857 | SEEHEAR: SIGNER DIARISATION AND A NEW DATASET |
4278 | SEEN AND UNSEEN EMOTIONAL STYLE TRANSFER FOR VOICE CONVERSION WITH A NEW EMOTIONAL SPEECH DATASET |
4493 | SEGMENTAL DTW: A PARALLELIZABLE ALTERNATIVE TO DYNAMIC TIME WARPING |
4398 | SEGREGATION IN SOCIAL NETWORKS: MARKOV BRIDGE MODELS AND ESTIMATION |
3975 | SEIZURE DETECTION USING POWER SPECTRAL DENSITY VIA HYPERDIMENSIONAL COMPUTING |
1832 | SELF-ATTENTION GENERATIVE ADVERSARIAL NETWORK FOR SPEECH ENHANCEMENT |
2570 | SELF-ATTENTIVE VAD: CONTEXT-AWARE DETECTION OF VOICE FROM NOISE |
2377 | SELF-AUGMENTED MULTI-MODAL FEATURE EMBEDDING |
1698 | Self-Convolution: A Highly-Efficient Operator for Non-Local Image Restoration |
4056 | SELFGAIT: A SPATIOTEMPORAL REPRESENTATION LEARNING METHOD FOR SELF-SUPERVISED GAIT RECOGNITION |
3392 | SELF-INFERENCE OF OTHERS' POLICIES FOR HOMOGENEOUS AGENTS IN COOPERATIVE MULTI-AGENT REINFORCEMENT LEARNING |
2214 | SELF-SUPERVISED DEPTH ESTIMATION VIA IMPLICIT CUES FROM VIDEOS |
4821 | SELF-SUPERVISED LEARNING BASED DOMAIN ADAPTATION FOR ROBUST SPEAKER VERIFICATION |
3313 | SELF-SUPERVISED LEARNING FOR FEW-SHOT IMAGE CLASSIFICATION |
2284 | SELF-SUPERVISED LEARNING FOR SLEEP STAGE CLASSIFICATION WITH PREDICTIVE AND DISCRIMINATIVE CONTRASTIVE CODING |
2762 | Self-supervised text-independent speaker verification using prototypical momentum contrastive learning |
3348 | Self-Supervised VQ-VAE For One-Shot Music Style Transfer |
3894 | Self-training and Pre-training are complementary for Speech Recognition |
3798 | Self-Training for Sound Event Detection in Audio Mixtures |
2239 | SEMANTIC IMAGE SYNTHESIS FROM INACCURATE AND COARSE MASKS |
1925 | SEMANTIC-AWARE CONTEXT AGGREGATION FOR IMAGE INPAINTING |
3473 | SEMANTIC-AWARE UNPAIRED IMAGE-TO-IMAGE TRANSLATION FOR URBAN SCENE IMAGES |
5593 | SEMIDEFINITE PROGRAMMING METHODS FOR ALLEVIATING CLOCK SYNCHRONIZATION BIAS AND SENSOR POSITION ERRORS IN TDOA LOCALIZATION |
3437 | SEMI-SUPERVISED BATCH ACTIVE LEARNING VIA BILEVEL OPTIMIZATION |
1546 | SEMI-SUPERVISED FEATURE EMBEDDING FOR DATA SANITIZATION IN REAL-WORLD EVENTS |
5243 | SEMI-SUPERVISED LEARNING FOR SINGING SYNTHESIS TIMBRE |
4573 | SEMI-SUPERVISED MULTIMODAL IMAGE TRANSLATION FOR MISSING MODALITY IMPUTATION |
2942 | SEMI-SUPERVISED SINGING VOICE SEPARATION WITH NOISY SELF-TRAINING |
1405 | SEMI-SUPERVISED SKIN LESION SEGMENTATION WITH LEARNING MODEL CONFIDENCE |
4488 | SEMI-SUPERVISED SPEECH RECOGNITION VIA GRAPH-BASED TEMPORAL CLASSIFICATION |
3543 | SEMI-SUPERVISED SPOKEN LANGUAGE UNDERSTANDING VIA SELF-SUPERVISED SPEECH AND LANGUAGE MODEL PRETRAINING |
2020 | Semi-supervised Time Series Classification by Temporal Relation Prediction |
3485 | SENONE-AWARE ADVERSARIAL MULTI-TASK TRAINING FOR UNSUPERVISED CHILD TO ADULT SPEECH ADAPTATION |
3667 | SENSOR NETWORKS TDOA SELF-CALIBRATION: 2D COMPLEXITY ANALYSIS AND SOLUTIONS |
3625 | SENTENCE BOUNDARY AUGMENTATION FOR NEURAL MACHINE TRANSLATION ROBUSTNESS |
4800 | SENTIMENT INJECTED ITERATIVELY CO-INTERACTIVE NETWORK FOR SPOKEN LANGUAGE UNDERSTANDING |
3680 | SEP-28K: A DATASET FOR STUTTERING EVENT DETECTION FROM PODCASTS WITH PEOPLE WHO STUTTER |
4718 | SEPNET: A DEEP SEPARATION MATRIX PREDICTION NETWORK FOR MULTICHANNEL AUDIO SOURCE SEPARATION |
4913 | SEQ-CPC: SEQUENTIAL CONTRASTIVE PREDICTIVE CODING FOR AUTOMATIC SPEECH RECOGNITION |
4122 | SEQUENCE-LEVEL SELF-TEACHING REGULARIZATION |
4187 | SEQUENCE-TO-SEQUENCE SINGING VOICE SYNTHESIS WITH PERCEPTUAL ENTROPY LOSS |
3616 | Sequential Adversarial Anomaly Detection with Deep Fourier Kernel |
1890 | SERN: STANCE EXTRACTION AND REASONING NETWORK FOR FAKE NEWS DETECTION |
1158 | SESQA: SEMI-SUPERVISED LEARNING FOR SPEECH QUALITY ASSESSMENT |
4129 | SHAPELET BASED VISUAL ASSESSMENT OF CLUSTER TENDENCY IN ANALYZING COMPLEX UPPER LIMB MOTION |
5261 | SHORT-TIME SPECTRAL AGGREGATION FOR SPEAKER EMBEDDING |
3969 | SHOW AND SPEAK: DIRECTLY SYNTHESIZE SPOKEN DESCRIPTION OF IMAGES |
1107 | SIAMESE CAPSULE NETWORK FOR END-TO-END SPEAKER RECOGNITION IN THE WILD |
1908 | SIG2SIG: SIGNAL TRANSLATION NETWORKS TO TAKE THE REMAINS OF THE PAST |
1464 | SIGN LANGUAGE SEGMENTATION WITH TEMPORAL CONVOLUTIONAL NETWORKS |
3193 | SIGNATURE FEATURE MARKING ENHANCED IRM FRAMEWORK FOR DRONE IMAGE ANALYSIS IN PRECISION AGRICULTURE |
4380 | SIMILARITY ANALYSIS OF SELF-SUPERVISED SPEECH REPRESENTATIONS |
2465 | SIML: SIEVED MAXIMUM LIKELIHOOD FOR ARRAY SIGNAL PROCESSING |
3403 | SIMPLEFLAT: A SIMPLE WHOLE-NETWORK PRE-TRAINING APPROACH FOR RNN TRANSDUCER-BASED END-TO-END SPEECH RECOGNITION |
1959 | SINGER IDENTIFICATION USING DEEP TIMBRE FEATURE LEARNING WITH KNN-NET |
4075 | SINGING LANGUAGE IDENTIFICATION USING A DEEP PHONOTACTIC APPROACH |
3682 | SINGING MELODY EXTRACTION FROM POLYPHONIC MUSIC BASED ON SPECTRAL CORRELATION MODELING |
3227 | SINGLE CHANNEL VOICE SEPARATION FOR UNKNOWN NUMBER OF SPEAKERS UNDER REVERBERANT AND NOISY SETTINGS |
1597 | SINGLE-POINT ARRAY RESPONSE CONTROL WITH MINIMUM PATTERN DEVIATION |
1897 | SKIP ATTENTION GAN FOR REMOTE SENSING IMAGE SYNTHESIS |
1524 | SLAP: A Split Latency Adaptive VLIW Pipeline Architecture which enables on-the-fly variable SIMD vector-length |
1184 | SLIDING-CAPON BASED CONVOLUTIONAL BEAMSPACE FOR LINEAR ARRAYS |
3912 | SLOW-FAST AUDITORY STREAMS FOR AUDIO RECOGNITION |
3013 | SM+: Refined Scale Match for Tiny Person Detection |
2576 | SMALL FOOTPRINT TEXT-INDEPENDENT SPEAKER VERIFICATION FOR EMBEDDED SYSTEMS |
4626 | SNR-ADAPTIVE DEEP JOINT SOURCE-CHANNEL CODING FOR WIRELESS IMAGE TRANSMISSION |
3249 | SOCIAL LEARNING UNDER INFERENTIAL ATTACKS |
1741 | SOLVING A CLASS OF NON-CONVEX MIN-MAX GAMES USING ADAPTIVE MOMENTUM METHODS |
3801 | SOUND EVENT DETECTION AND SEPARATION: A BENCHMARK ON DESED SYNTHETIC SOUNDSCAPES |
2952 | SOUND EVENT DETECTION BASED ON CURRICULUM LEARNING CONSIDERING LEARNING DIFFICULTY OF EVENTS |
2286 | SOUND EVENT DETECTION BY CONSISTENCY TRAINING AND PSEUDO-LABELING WITH FEATURE-PYRAMID CONVOLUTIONAL RECURRENT NEURAL NETWORKS |
4099 | SOUND EVENT DETECTION IN URBAN AUDIO WITH SINGLE AND MULTI-RATE PCEN |
2678 | SOUND RECOVERY FROM RADIO SIGNALS |
4023 | SOURCE-AWARE NEURAL SPEECH CODING FOR NOISY SPEECH COMPRESSION |
3396 | SPARSE ARRAY TRANSCEIVER DESIGN FOR ENHANCED ADAPTIVE BEAMFORMING IN MIMO RADAR |
2947 | SPARSE BAYESIAN LEARNING FOR ACOUSTIC SOURCE LOCALIZATION |
4181 | SPARSE FACTORIZATION-BASED DETECTION OF OFF-THE-GRID MOVING TARGETS USING FMCW RADARS |
3020 | SPARSE FLOW ADVERSARIAL MODEL FOR ROBUST IMAGE COMPRESSION |
4448 | SPARSE GRAPH BASED SKETCHING FOR FAST NUMERICAL LINEAR ALGEBRA |
2280 | SPARSE HIGH-ORDER PORTFOLIOS VIA PROXIMAL DCA AND SCA |
1640 | SPARSE PARAMETER ESTIMATION FOR PMCW MIMO RADAR USING FEW-BIT ADCS |
4025 | SPARSE RECOVERY BEAMFORMING AND UPSCALING IN THE RAY SPACE |
3248 | SPARSE REPRESENTATION OF COMPLEX-VALUED FMRI DATA BASED ON HARD THRESHOLDING OF SPATIAL SOURCE PHASE |
3192 | SPARSE TIME-FREQUENCY REPRESENTATION VIA ATOMIC NORM MINIMIZATION |
5416 | SPARSE-CODED DYNAMIC MODE DECOMPOSITION ON GRAPH FOR PREDICTION OF RIVER WATER LEVEL DISTRIBUTION |
2540 | SPARSIFICATION VIA COMPRESSED SENSING FOR AUTOMATIC SPEECH RECOGNITION |
4897 | SPARSITY AND NONNEGATIVITY CONSTRAINED KRYLOV APPROACH FOR DIRECTION OF ARRIVAL ESTIMATION |
5466 | SPARSITY DRIVEN LATENT SPACE SAMPLING FOR GENERATIVE PRIOR BASED COMPRESSIVE SENSING |
3965 | SPARSITY IN MAX-PLUS ALGEBRA AND APPLICATIONS IN MULTIVARIATE CONVEX REGRESSION |
1519 | SPATIAL EQUALIZATION BEFORE RECEPTION: RECONFIGURABLE INTELLIGENT SURFACES FOR MULTI-PATH MITIGATION |
4146 | SPATIOTEMPORAL ATTENTION FOR MULTIVARIATE TIME SERIES PREDICTION AND INTERPRETATION |
3427 | SPEAKER ACTIVITY DRIVEN NEURAL SPEECH EXTRACTION |
3359 | SPEAKER AND DIRECTION INFERRED DUAL-CHANNEL SPEECH SEPARATION |
3122 | Speaker embeddings for diarization of broadcast data in the ALLIES challenge |
3843 | SPEAKER-INDEPENDENT BRAIN ENHANCED SPEECH DENOISING |
4465 | SPEAKING RATE AND TONAL REALIZATION IN MANDARIN CHINESE: WHAT CAN WE LEARN FROM LARGE SPEECH CORPORA? |
3967 | SPECIALIZED EMBEDDING APPROXIMATION FOR EDGE INTELLIGENCE: A CASE STUDY IN URBAN SOUND CLASSIFICATION |
1272 | SPECTRAL DOMAIN CONVOLUTIONAL NEURAL NETWORK |
2736 | Spectral folding and two-channel filter-banks on arbitrary graphs |
1035 | Speech Acoustic Modelling from Raw Phase Spectrum |
1815 | SPEECH BASED DEPRESSION PREDICTION USING ENCODER-WEIGHT-ONLY TRANSFER LEARNING AND A LARGE CORPUS |
4385 | SPEECH BERT EMBEDDING FOR IMPROVING PROSODY IN NEURAL TTS |
1606 | SPEECH DEREVERBERATION USING VARIATIONAL AUTOENCODERS |
4128 | Speech Emotion Recognition based on Listener Adaptive Models |
3927 | SPEECH EMOTION RECOGNITION USING QUATERNION CONVOLUTIONAL NEURAL NETWORKS |
4963 | Speech Emotion Recognition using Semantic Information |
2869 | Speech Emotion Recognition with Multiscale Area Attention and Data Augmentation |
5236 | Speech enhancement aided end-to-end multi-task learning for voice activity detection |
3390 | SPEECH ENHANCEMENT AUTOENCODER WITH HIERARCHICAL LATENT STRUCTURE |
5606 | Speech Enhancement Using Masking for Binaural Reproduction of Ambisonics Signals |
3278 | SPEECH ENHANCEMENT WITH MIXTURE OF DEEP EXPERTS WITH CLEAN CLUSTERING PRE-TRAINING |
2717 | SPEECH PREDICTION IN SILENT VIDEOS USING VARIATIONAL AUTOENCODERS |
2378 | SPEECH RECOGNITION BY SIMPLY FINE-TUNING BERT |
4034 | SPEECH-LANGUAGE PRE-TRAINING FOR END-TO-END SPOKEN LANGUAGE UNDERSTANDING |
3353 | SPEEDING UP OF KERNEL-BASED LEARNING FOR HIGH-ORDER TENSORS |
4930 | SPHERICAL HARMONIC REPRESENTATION FOR DYNAMIC SOUND-FIELD MEASUREMENTS |
3705 | SPOKEN LANGUAGE IDENTIFICATION IN UNSEEN TARGET DOMAIN USING WITHIN-SAMPLE SIMILARITY LOSS |
4593 | SQUEEZING VALUE OF CROSS-DOMAIN LABELS: A DECOUPLED SCORING APPROACH FOR SPEAKER VERIFICATION |
2796 | SQWA: STOCHASTIC QUANTIZED WEIGHT AVERAGING FOR IMPROVING THE GENERALIZATION CAPABILITY OF LOW-PRECISION DEEP NEURAL NETWORKS |
1770 | SRF-NET: SELECTIVE RECEPTIVE FIELD NETWORK FOR ANCHOR-FREE TEMPORAL ACTION DETECTION |
1224 | SSFENET: SPATIAL AND SEMANTIC FEATURE ENHANCEMENT NETWORK FOR OBJECT DETECTION |
4584 | SSLIDE: Sound Source Localization for Indoors based on Deep Learning |
2374 | STABILITY ANALYSIS OF THE RC-PLMS ADAPTIVE BEAMFORMER USING A SIMPLE TRANSFER FUNCTION APPROXIMATION |
4125 | STABILITY OF ALGEBRAIC NEURAL NETWORKS TO SMALL PERTURBATIONS |
2825 | STABLE AND EFFECTIVE ONE-STEP METHOD FOR PERSON SEARCH |
3782 | STABLE CHECKPOINT SELECTION AND EVALUATION IN SEQUENCE TO SEQUENCE SPEECH SYNTHESIS |
3285 | STATISTICAL CORRECTION OF TRANSCRIBED MELODY NOTES BASED ON PROBABILISTIC INTEGRATION OF A MUSIC LANGUAGE MODEL AND A TRANSCRIPTION ERROR MODEL |
2900 | STATISTICAL DISTANCE METRIC LEARNING FOR IMAGE SET RETRIEVAL |
2728 | STATISTICAL PROPERTIES OF A MODIFIED WELCH METHOD THAT USES SAMPLE PERCENTILES |
3009 | ST-BERT: CROSS-MODAL LANGUAGE MODEL PRE-TRAINING FOR END-TO-END SPOKEN LANGUAGE UNDERSTANDING |
2124 | STEP-GAN: A One-Class Anomaly Detection Model with Applications to Power System Security |
4899 | STEREO RECTIFICATION BASED ON EPIPOLAR CONSTRAINED NEURAL NETWORK |
1176 | Stochastic Deep Unfolding for Imaging Inverse Problems |
2991 | STOCHASTIC SUCCESSIVE WEIGHTED SUM-RATE MAXIMIZATION FOR MULTIUSER MIMO SYSTEMS WITH FINITE-ALPHABET INPUTS |
5421 | STOCK MOVEMENT PREDICTION AND PORTFOLIO MANAGEMENT VIA MULTIMODAL LEARNING WITH TRANSFORMER |
5302 | STREAMING END-TO-END SPEECH RECOGNITION WITH JOINTLY TRAINED NEURAL FEATURE ENHANCEMENT |
2057 | STREAMING MULTI-SPEAKER ASR WITH RNN-T |
4827 | STREAMING SIMULTANEOUS SPEECH TRANSLATION WITH AUGMENTED MEMORY TRANSFORMER |
4941 | Strong Data Augmentation Sanitizes Poisoning and Backdoor Attacks Without an Accuracy Tradeoff |
3022 | STRUCTURE-AWARE AUDIO-TO-SCORE ALIGNMENT USING PROGRESSIVELY DILATED CONVOLUTIONAL NEURAL NETWORKS |
1552 | STRUCTURED SUPPORT EXPLORATION FOR MULTILAYER SPARSE MATRIX FACTORIZATION |
4639 | STRUCTURE-ENHANCED ATTENTIVE LEARNING FOR SPINE SEGMENTATION FROM ULTRASOUND VOLUME PROJECTION IMAGES |
1274 | STYLEMELGAN: AN EFFICIENT HIGH-FIDELITY ADVERSARIAL VOCODER WITH TEMPORAL ADAPTIVE NORMALIZATION |
4519 | SUB-BAND GROUPING SPECTRAL FEATURE-ATTENTION BLOCK FOR HYPERSPECTRAL IMAGE CLASSIFICATION |
5189 | SUBJECT-INVARIANT EEG REPRESENTATION LEARNING FOR EMOTION RECOGNITION |
3955 | Subjective and objective evaluation of deepfake videos |
1266 | SUB-NYQUIST MULTICHANNEL BLIND DECONVOLUTION |
2464 | SUBSPACE ODDITY - OPTIMIZATION ON PRODUCT OF STIEFEL MANIFOLDS FOR EEG DATA |
3394 | SUBSPECTRAL NORMALIZATION FOR NEURAL AUDIO DATA PROCESSING |
1730 | SUPER-RESOLUTION AND INFECTION EDGE DETECTION CO-GUIDED LEARNING FOR COVID-19 CT SEGMENTATION |
3480 | SUPER-RESOLUTION OF PERIODIC SIGNALS FROM SHORT SEQUENCES OF SAMPLES |
2731 | Supervised Chorus Detection for Popular Music Using Convolutional Neural Network and Multi-task Learning |
4763 | Supervised direct-path relative transfer function learning for binaural sound source localization |
4467 | SUREmap: Predicting Uncertainty in CNN-based Image Reconstructions using Stein's Unbiased Risk Estimate |
3674 | SURROGATE SOURCE MODEL LEARNING FOR DETERMINED SOURCE SEPARATION |
4877 | SWITCHED HAWKES PROCESSES |
3517 | Switching Variational Auto-Encoders for Noise-Agnostic Audio-visual Speech Enhancement |
4389 | Symmetric Sub-graph Spatio-Temporal Graph Convolution and its application in Complex Activity Recognition |
5322 | SYNAUG: SYNTHESIS-BASED DATA AUGMENTATION FOR TEXT-DEPENDENT SPEAKER VERIFICATION |
1287 | SYNCHRONOUS MULTI-BIT AUDIO WATERMARKING BASED ON PHASE SHIFTING |
1799 | SYNERGIC FEATURE ATTENTION FOR IMAGE RESTORATION |
3080 | SYNTACTIC REPRESENTATION LEARNING FOR NEURAL NETWORK BASED TTS WITH SYNTACTIC PARSE TREE TRAVERSAL |
4609 | SYNTHESIS OF NEW WORDS FOR IMPROVED DYSARTHRIC SPEECH RECOGNITION ON AN EXPANDED VOCABULARY |
4321 | SYNTHESIZE & LEARN: JOINTLY OPTIMIZING GENERATIVE AND CLASSIFIER NETWORKS FOR IMPROVED DROWSINESS DETECTION |
4086 | SYNTHETIC APERTURE ACOUSTIC IMAGING WITH DEEP GENERATIVE MODEL BASED SOURCE DISTRIBUTION PRIOR |
1499 | SYNTHETIC DATA FOR DNN-BASED DOA ESTIMATION OF INDOOR SPEECH |
3573 | TABULAR TRANSFORMERS FOR MODELING MULTIVARIATE TIME SERIES |
1446 | TAKING A CLOSER LOOK AT SYNTHESIS: FINE-GRAINED ATTRIBUTE ANALYSIS FOR PERSON RE-IDENTIFICATION |
3752 | TAMING VOTING ALGORITHMS ON GPUS FOR AN EFFICIENT CONNECTED COMPONENT ANALYSIS ALGORITHM |
5527 | TARGET DETECTION FROM DISTRIBUTED PASSIVE SENSORS: SEMI-LABELED DATA QUANTIZATION |
3924 | Target Detection in Frequency Hopping MIMO Dual-Function Radar-Communication Systems |
5355 | Task Aware Multi-Task Learning for Speech to Text Tasks |
4568 | TASK-AWARE NEURAL ARCHITECTURE SEARCH |
5058 | TASK-RELATED SELF-SUPERVISED LEARNING FOR REMOTE SENSING IMAGE CHANGE DETECTION |
2335 | TCLA ARRAY: A NEW SPARSE ARRAY DESIGN WITH LESS MUTUAL COUPLING |
2244 | Teacher-Assisted Mini-Batch Sampling for Blind Distillation using Metric Learning |
3991 | TEACHER-STUDENT LEARNING FOR LOW-LATENCY ONLINE SPEECH ENHANCEMENT USING WAVE-U-NET |
5226 | TEACHER-STUDENT LEARNING WITH MULTI-GRANULARITY CONSTRAINT TOWARDS COMPACT FACIAL FEATURE REPRESENTATION |
2702 | TEMPORAL EXEMPLAR CHANNELS IN HIGH-MULTIPATH ENVIRONMENTS |
1716 | TEMPORAL LINK PREDICTION VIA REINFORCEMENT LEARNING |
2209 | TEMPORAL RAIN DECOMPOSITION WITH SPATIAL STRUCTURE GUIDANCE FOR VIDEO DERAINING |
1187 | TENSOR DECOMPOSITION VIA CORE TENSOR NETWORKS |
3758 | TENSOR REORDERING FOR CNN COMPRESSION |
3333 | Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events |
5008 | THE ACCENTED ENGLISH SPEECH RECOGNITION CHALLENGE 2020: OPEN DATASETS, TRACKS, BASELINES, RESULTS AND METHODS |
5587 | THE AUTOMATIC DETECTION OF SPEECH DISORDER IN CHILDREN: CHALLENGES, OPPORTUNITIES, AND PRELIMINARY RESULTS |
4045 | THE BENEFIT OF TEMPORALLY-STRONG LABELS IN AUDIO EVENT CLASSIFICATION |
2441 | THE FAR-FIELD EQUATORIAL ARRAY FOR BINAURAL RENDERING |
3865 | THE IDLAB VOXSRC-20 SUBMISSION: LARGE MARGIN FINE-TUNING AND QUALITY-AWARE SCORE CALIBRATION IN DNN BASED SPEAKER VERIFICATION |
4341 | The ins and outs of speaker recognition: lessons from VoxSRC 2020 |
3944 | THE IN-THE-WILD SPEECH MEDICAL CORPUS |
3814 | THE ROLE OF TASK AND ACOUSTIC SIMILARITY IN AUDIO TRANSFER LEARNING: INSIGHTS FROM THE SPEECH EMOTION RECOGNITION CASE |
3256 | THE USE OF VOICE SOURCE FEATURES FOR SUNG SPEECH RECOGNITION |
3800 | TIME-DOMAIN CONCENTRATION AND APPROXIMATION OF COMPUTABLE BANDLIMITED SIGNALS |
1723 | TIME-DOMAIN LOSS MODULATION BASED ON OVERLAP RATIO FOR MONAURAL CONVERSATIONAL SPEAKER SEPARATION |
3431 | Time-domain speaker verification using temporal convolutional networks |
3923 | TIME-DOMAIN SPEECH EXTRACTION WITH SPATIAL INFORMATION AND MULTI SPEAKER CONDITIONING MECHANISM |
1555 | Time-varying graph signal inpainting via unrolling networks |
3055 | TINY TRANSDUCER: A HIGHLY-EFFICIENT SPEECH RECOGNITION MODEL ON EDGE DEVICES |
1009 | t-k-means: A ROBUST AND STABLE k-means VARIANT |
4770 | To Supervise or Not To Supervise: How to Effectively Learn Wireless Interference Management Models? |
3806 | TOP-DOWN ATTENTION IN END-TO-END SPOKEN LANGUAGE UNDERSTANDING |
2186 | Topic Sequence Embedding for User Identity Linkage from Heterogeneous Behavior Data |
3597 | TOPIC-AWARE DIALOGUE GENERATION WITH TWO-HOP BASED GRAPH ATTENTION |
5608 | Topological IIR Filters Over Simplicial Topologies via Sheaves |
4505 | TOPOLOGICAL VOLTERRA FILTERS |
2122 | Toward Skills Dialog Orchestration with Online Learning |
2822 | Towards Adversarial Robustness via Compact Feature Representations |
4501 | TOWARDS AN ASR APPROACH USING ACOUSTIC AND LANGUAGE MODELS FOR SPEECH ENHANCEMENT |
2301 | TOWARDS AN INTRINSIC DEFINITION OF ROBUSTNESS FOR A CLASSIFIER |
5002 | TOWARDS DATA SELECTION ON TTS DATA FOR CHILDREN'S SPEECH RECOGNITION |
2221 | TOWARDS EFFICIENT AGE ESTIMATION BY EMBEDDING POTENTIAL GENDER FEATURES |
1630 | Towards efficient models for real-time deep noise suppression |
4927 | Towards Efficiently Diversifying Dialogue Generation via Embedding Augmentation |
5062 | TOWARDS EXPLAINING EXPRESSIVE QUALITIES IN PIANO RECORDINGS: TRANSFER OF EXPLANATORY FEATURES VIA ACOUSTIC DOMAIN ADAPTATION |
4376 | TOWARDS IMMEDIATE BACKCHANNEL GENERATION USING ATTENTION-BASED EARLY PREDICTION MODEL |
1244 | TOWARDS LISTENING TO 10 PEOPLE SIMULTANEOUSLY: AN EFFICIENT PERMUTATION INVARIANT TRAINING OF AUDIO SOURCE SEPARATION USING SINKHORN’S ALGORITHM |
2107 | TOWARDS LOW-RESOURCE STARGAN VOICE CONVERSION USING WEIGHT ADAPTIVE INSTANCE NORMALIZATION |
2243 | TOWARDS NATURAL AND CONTROLLABLE CROSS-LINGUAL VOICE CONVERSION BASED ON NEURAL TTS MODEL AND PHONETIC POSTERIORGRAM |
5294 | TOWARDS PARKINSON’S DISEASE PROGNOSIS USING SELF-SUPERVISED LEARNING AND ANOMALY DETECTION |
1476 | TOWARDS PRACTICAL LIPREADING WITH DISTILLED AND EFFICIENT MODELS |
4137 | TOWARDS PRACTICAL NEAR-MAXIMUM-LIKELIHOOD DECODING OF ERROR-CORRECTING CODES: AN OVERVIEW |
2759 | Towards Robust Speaker Verification with Target Speaker Enhancement |
5325 | TOWARDS ROBUST TRAINING OF MULTI-SENSOR DATA FUSION NETWORK AGAINST ADVERSARIAL EXAMPLES IN SEMANTIC SEGMENTATION |
2597 | TOWARDS THE DEVELOPMENT OF SUBJECT-INDEPENDENT INVERSE METABOLIC MODELS |
4766 | TRAFFIC SPEED FORECASTING VIA SPATIO-TEMPORAL ATTENTIVE GRAPH ISOMORPHISM NETWORK |
3697 | TRAIN YOUR CLASSIFIER FIRST: CASCADE NEURAL NETWORKS TRAINING FROM UPPER LAYERS TO LOWER LAYERS |
1470 | TRAINING A BANK OF WIENER MODELS WITH A NOVEL QUADRATIC MUTUAL INFORMATION COST FUNCTION |
3929 | TRAINING LOGICAL NEURAL NETWORKS BY PRIMAL–DUAL METHODS FOR NEURO-SYMBOLIC REASONING |
2369 | Training Neural Networks with Domain Pattern-Aware Auxiliary Task for Sleep Staging |
4453 | TRAINING NOISY SINGLE-CHANNEL SPEECH SEPARATION WITH NOISY ORACLE SOURCES: A LARGE GAP AND A SMALL STEP |
1870 | TRAINING REAL-TIME PANORAMIC OBJECT DETECTORS WITH VIRTUAL DATASET |
2089 | TRAINING SPEECH RECOGNITION MODELS WITH FEDERATED LEARNING: A QUALITY/COST FRAMEWORK |
2698 | TRANSCRIPTION IS ALL YOU NEED: LEARNING TO SEPARATE MUSICAL MIXTURES WITH SCORE AS SUPERVISION |
4250 | TRANSFER LEARNING FOR INPUT ESTIMATION OF VEHICLE SYSTEMS |
5333 | TRANSFORMER BASED UNSUPERVISED PRE-TRAINING FOR ACOUSTIC REPRESENTATION LEARNING |
4127 | Transformer in action: a comparative study of transformer-based acoustic models for large scale speech recognition applications |
3907 | Transformer Language Models with LSTM-based Cross-utterance Information Representation |
1847 | Transformer-based End-to-End Speech Recognition with Local Dense Synthesizer Attention |
2168 | Transformer-Transducers for Code-Switched Speech Recognition |
1619 | TRANSITIVE TRANSFER SPARSE CODING FOR DISTANT DOMAIN |
5080 | TRANSMASK: A COMPACT AND FAST SPEECH SEPARATION MODEL BASED ON TRANSFORMER |
3274 | TRANSMITTANCE REGULARIZER FOR BINARY CODED APERTURE DESIGN IN A COMPUTATIONAL IMAGING END-TO-END APPROACH |
4195 | TREATMENT EFFECT ESTIMATION USING INVARIANT RISK MINIMIZATION |
3623 | TRIPLE SEQUENCE GENERATIVE ADVERSARIAL NETS FOR UNSUPERVISED IMAGE CAPTIONING |
4707 | TSTNN: TWO-STAGE TRANSFORMER BASED NEURAL NETWORK FOR SPEECH ENHANCEMENT IN THE TIME DOMAIN |
3277 | TTS-BY-TTS: TTS-DRIVEN DATA AUGMENTATION FOR FAST AND HIGH-QUALITY SPEECH SYNTHESIS |
3145 | TUCKER DECOMPOSITION FOR EXTRACTING SHARED AND INDIVIDUAL SPATIAL MAPS FROM MULTI-SUBJECT RESTING-STATE FMRI DATA |
2489 | Two-Stage Adaptive Pooling with RT-qPCR for COVID-19 Screening |
2888 | Two-Stage Framework for Seasonal Time Series Forecasting |
4426 | Two-stage Graph-constrained Group Testing: Theory and Application |
4641 | TWO-STAGE TEXTUAL KNOWLEDGE DISTILLATION FOR END-TO-END SPOKEN LANGUAGE UNDERSTANDING |
2006 | TYPINGWRISTBAND: A HUMAN SLIGHT MOTION SENSING SYSTEM BASED ON VIBRATION DETECTION |
5099 | U-CONVOLUTION BASED RESIDUAL ECHO SUPPRESSION WITH MULTIPLE ENCODERS |
2457 | ULTRA-LIGHTWEIGHT SPEECH SEPARATION VIA GROUP COMMUNICATION |
4858 | ULTRA-LOW BITRATE VIDEO CONFERENCING USING DEEP IMAGE ANIMATION |
4261 | ULTRASOUND ELASTICITY IMAGING USING PHYSICS-BASED MODELS AND LEARNING-BASED PLUG-AND-PLAY PRIORS |
2131 | UNCERTAINTY-BASED BIOLOGICAL AGE ESTIMATION OF BRAIN MRI SCANS |
3563 | Unfolding Neural Networks for Compressive Multichannel Blind Deconvolution |
3053 | UNIDIRECTIONAL MEMORY-SELF-ATTENTION TRANSDUCER FOR ONLINE SPEECH RECOGNITION |
4328 | Unified Clustering and Outlier Detection on Specialized Hardware |
4757 | UNIFIED GRADIENT REWEIGHTING FOR MODEL BIASING WITH APPLICATIONS TO SOURCE SEPARATION |
4556 | UNIT SELECTION SYNTHESIS BASED DATA AUGMENTATION FOR FIXED PHRASE SPEAKER VERIFICATION |
2386 | UNIVERSAL NEURAL VOCODING WITH PARALLEL WAVENET |
2974 | UNROLLING OF DEEP GRAPH TOTAL VARIATION FOR IMAGE DENOISING |
4219 | UNSUPERVISED AND SEMI-SUPERVISED FEW-SHOT ACOUSTIC EVENT CLASSIFICATION |
5059 | UNSUPERVISED AUDIO-VISUAL SUBSPACE ALIGNMENT FOR HIGH-STAKES DECEPTION DETECTION |
3979 | Unsupervised Clustering of Time Series Signals using Neuromorphic Energy-Efficient Temporal Neural Networks |
1480 | UNSUPERVISED COMMON PARTICULAR OBJECT DISCOVERY AND LOCALIZATION BY ANALYZING A MATCH GRAPH |
4255 | UNSUPERVISED CONTRASTIVE LEARNING OF SOUND EVENT REPRESENTATIONS |
4356 | UNSUPERVISED DISCRIMINATIVE LEARNING OF SOUNDS FOR AUDIO EVENT CLASSIFICATION |
4976 | Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training |
4550 | UNSUPERVISED HEART ABNORMALITY DETECTION BASED ON PHONOCARDIOGRAM ANALYSIS WITH BETA VARIATIONAL AUTO-ENCODERS |
3629 | UNSUPERVISED IMAGE SEGMENTATION WITH SPATIAL TRIPLET MARKOV TREES |
4269 | UNSUPERVISED LEARNING FOR ASYNCHRONOUS RESOURCE ALLOCATION IN AD-HOC WIRELESS NETWORKS |
3222 | UNSUPERVISED LEARNING FOR MULTI-STYLE SPEECH SYNTHESIS WITH LIMITED DATA |
4982 | UNSUPERVISED MOTION REPRESENTATION ENHANCED NETWORK FOR ACTION RECOGNITION |
1243 | Unsupervised Multimodal Image Registration with Adaptative Gradient Guidance |
2705 | UNSUPERVISED MUSICAL TIMBRE TRANSFER FOR NOTIFICATION SOUNDS |
2868 | Unsupervised neural adaptation model based on optimal transport for spoken language identification |
5390 | UNSUPERVISED RECONSTRUCTION OF SEA SURFACE CURRENTS FROM AIS MARITIME TRAFFIC DATA USING LEARNABLE VARIATIONAL MODELS |
4738 | UNSUPERVISED STACKED CAPSULE AUTOENCODER FOR HYPERSPECTRAL IMAGE CLASSIFICATION |
1171 | Unveiling anomalous nodes via random sampling and consensus on graphs |
1819 | UPSAMPLING ARTIFACTS IN NEURAL AUDIO SYNTHESIS |
2136 | USERREG: A SIMPLE BUT STRONG MODEL FOR RATING PREDICTION |
4606 | USING DEEP IMAGE PRIORS TO GENERATE COUNTERFACTUAL EXPLANATIONS |
1461 | USING SYNTHETIC AUDIO TO IMPROVE THE RECOGNITION OF OUT-OF-VOCABULARY WORDS IN END-TO-END ASR SYSTEMS |
4418 | uTDN: An Unsupervised Two-Stream Dirichlet-Net for Hyperspectral Unmixing |
2561 | VALIDATING THE INSPIRED SINEWAVE TECHNIQUE TO MEASURE LUNG HETEROGENEITY COMPARED TO ATELECTASIS & OVER-DISTENDED VOLUME IN COMPUTED TOMOGRAPHY IMAGES |
3908 | VARIANCE-CONSTRAINED LEARNING FOR STOCHASTIC GRAPH NEURAL NETWORKS |
3684 | VARIATIONAL AUTOENCODER FOR SPEECH ENHANCEMENT WITH A NOISE-AWARE ENCODER |
4085 | VARIATIONAL AUTOENCODERS FOR HYPERSPECTRAL UNMIXING WITH ENDMEMBER VARIABILITY |
5600 | VARIATIONAL DENOISING AUTOENCODERS AND LEAST-SQUARES POLICY ITERATION FOR STATISTICAL DIALOGUE MANAGERS |
3501 | VARIATIONAL DIALOGUE GENERATION WITH NORMALIZING FLOWS |
5321 | VARIATIONAL PARAMETER LEARNING IN SEQUENTIAL STATE-SPACE MODEL VIA PARTICLE FILTERING |
2918 | VARIATION-STABLE FUSION FOR PPG-BASED BIOMETRIC SYSTEM |
1939 | VEHICLE 3D LOCALIZATION IN ROAD SCENES VIA A MONOCULAR MOVING CAMERA |
4126 | VGAI: END-TO-END LEARNING OF VISION-BASED DECENTRALIZED CONTROLLERS FOR ROBOT SWARMS |
3807 | VIDEO QUALITY PREDICTION USING VOXEL-WISE fMRI MODELS OF THE VISUAL CORTEX |
3716 | VIOLENCE DETECTION IN VIDEOS BASED ON FUSING VISUAL AND AUDIO INFORMATION |
1011 | VISUAL PRIVACY PROTECTION VIA MAPPING DISTORTION |
5198 | VISUALIZING ASSOCIATION IN EXEMPLAR-BASED CLASSIFICATION |
3642 | VK-Net: Category-level Point Cloud Registration with Unsupervised Rotation Invariant Keypoints |
3362 | VOWEL NON-VOWEL BASED SPECTRAL WARPING AND TIME SCALE MODIFICATION FOR IMPROVEMENT IN CHILDREN’S ASR |
4118 | VSET: A Multimodal Transformer for Visual Speech Enhancement |
3001 | WAKE WORD DETECTION WITH STREAMING TRANSFORMERS |
3207 | WARP-Q: QUALITY PREDICTION FOR GENERATIVE NEURAL SPEECH CODECS |
5480 | WASE: LEARNING WHEN TO ATTEND FOR SPEAKER EXTRACTION IN COCKTAIL PARTY ENVIRONMENTS |
2670 | WASSERSTEIN BARYCENTER TRANSPORT FOR ACOUSTIC ADAPTATION |
5350 | WAVE-DOMAIN OPTIMIZATION OF SECONDARY SOURCE PLACEMENT FREE FROM INFORMATION OF ERROR SENSOR POSITIONS |
5370 | Waveform Design for the Joint MIMO Radar and Communications With Low Integrated Sidelobe Levels and Accurate Information Embedding |
2729 | WAVE-TACOTRON: SPECTROGRAM-FREE END-TO-END TEXT-TO-SPEECH SYNTHESIS |
2479 | Weakly Supervised Patch Label Inference Network with Image Pyramid for Pavement Diseases Recognition in the Wild |
5343 | Wearing a MASK: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels |
2046 | WEBLY SUPERVISED DEEP ATTENTIVE QUANTIZATION |
3638 | WEIGHT IDENTIFICATION THROUGH GLOBAL OPTIMIZATION IN A NEW HYSTERETIC NEURAL NETWORK MODEL |
3813 | WEIGHTED MAGNITUDE-PHASE LOSS FOR SPEECH DEREVERBERATION |
3225 | Weighted Recursive Least Square Filter and Neural Network based Residual Echo Suppression for the AEC-Challenge |
1310 | WHAT AND WHERE TO FOCUS IN PERSON SEARCH |
3618 | WHAT'S ALL THE FUSS ABOUT FREE UNIVERSAL SOUND SEPARATION DATA? |
5336 | When Face Recognition Meets Occlusion: A New Benchmark |
3530 | WIDE AND DEEP GRAPH NEURAL NETWORKS WITH DISTRIBUTED ONLINE LEARNING |
3338 | WIENER FILTER ON MEET/JOIN LATTICES |
2626 | WIFI-BASED DEVICE-FREE GESTURE RECOGNITION THROUGH-THE-WALL |
1103 | WINDOW BEAMFORMER FOR SPARSE CONCENTRIC CIRCULAR ARRAY |
4468 | Word-Level ASL Recognition and Trigger Sign Detection with RF Sensors |
2459 | YAPA: ACCELERATED PROXIMAL ALGORITHM FOR CONVEX COMPOSITE PROBLEMS |
4035 | "YOU SHOULD PROBABLY READ THIS": HEDGE DETECTION IN TEXT |
5439 | ZERO-GRADIENT CONSTRAINTS FOR DESTRIPING OF REMOTE-SENSING DATA |
3766 | ZERO-SHOT AUDIO CLASSIFICATION WITH FACTORED LINEAR AND NONLINEAR ACOUSTIC-SEMANTIC PROJECTIONS |
4849 | ZERO-SHOT VOICE CONVERSION WITH ADJUSTED SPEAKER EMBEDDINGS AND SIMPLE ACOUSTIC FEATURES |