Vassilios Diakoloukas

Office: 141Α-20, Science/ECE Building (Λ), 1st Floor · +302821037220 · vdiakoloukas@isc.tuc.gr

Dr. Vassilios Diakoloukas received his diploma in Physics from the University of Crete, Heraklion, in 1994. In 1996 he received his Master’s Diploma (MSc) from the School of Electrical Engineering of the University of Durham in England, and in 2000 he obtained his PhD from the School of Electrical and Computer Engineering (ECE) of the Technical University of Crete.

Since December 2003 he has been a teaching and research associate at the School of Electrical and Computer Engineering of the Technical University of Crete. He has participated in several national and international research projects. In 2000 he was honored by the IEEE Signal Processing Society with the Young Author Best Paper Award for one of his journal publications. He is a member of the IEEE Signal Processing Society and the International Speech Communication Association (ISCA). He is the author or co-author of several journal and conference papers and holds two patents. His research interests include speech recognition, speech synthesis, signal processing (including speech processing), pattern recognition and stochastic modeling, natural language processing, spoken language understanding, language modeling and recognition, and statistical dialogue managers.

Experience

Research Experience

Canonical Reform of Speech Recognition Hypothesis: Adaptation in Greek language (Wmatch-Apptek)

Project Manager, Telecommunication Systems Institute, funded by Apptek GmbH

January 2020 - April 2020

Basic Research for Statistical Dialog Managers

PI, Telecommunication Systems Institute, funded by Toshiba Research Europe Limited

2016 - 2018

Linear Dynamical Models for Speech Synthesis

Senior Researcher, Telecommunication Systems Institute, funded by Toshiba Research Europe Limited

2013 - 2015

Human Input that Works in Real Environments (HIWIRE)

Researcher, Telecommunication Systems Institute, European Union FP6-IST project

2004 - 2007

Enhanced Speech traCking of Air traffic controL communications (ESCALE)

PI/Technical Manager, Telecommunication Systems Institute, funded by THALES Research & Technology France, Eurocontrol – CARE INO II programme

October 2005 - December 2005

EPET II – Logotypografia

Researcher, Technical University of Crete, funded by the Greek General Secretariat for Research and Technology (GSRT)

1999 - 2000

SRI-Dialect Adaptation and Speaker/Channel Normalization in a Spoken Language Translator (SLT)

Researcher, Technical University of Crete, funded by SRI International and Telia Research

1996 - 1999

Teaching Experience

School of Electrical and Computer Engineering, Technical University of Crete

Senior Special Teaching Personnel (tenured)

Teaching Courses, tutorials and laboratories:

Stochastic Modeling and Pattern Recognition
Digital Signal Processing
Signals and Systems
Probabilities and Random Processes
Statistical Signal Processing
Information Theory
Telecommunication Systems
Introduction to Speech Processing

2003 - today

School of Electronic Engineering, Technological Educational Institute of Crete

Adjunct Professor

Teaching Courses, tutorials and laboratories:

Speech and Image Processing
Signals and Systems

2001 - 2004

Work Experience

Dialogos Speech Communications S.A. Greece

Speech and Dialogue Expert/Project Manager/Technical Manager

Selected Projects:

Development of acoustic models for Nuance's Speech Recognition system (Greek, Arabic, Turkish, Catalan, Italian)
Speech recognition system evaluation and optimization
Acoustic model adaptation
Development of language models (rule-based and statistical)

2001 -

Education

Technical University of Crete, GR

PhD in Electrical and Computer Engineering

PhD Thesis: Maximum Likelihood Stochastic Transformations for continuous speech recognition

This thesis presents our efforts to address two major problems in current large vocabulary continuous speech recognition systems. The first problem is the mismatched conditions between the training and testing sets. We particularly focus on the performance degradation due to different speakers and dialects. The second problem is the explicit modeling of the inter-frame correlations in a speaker-independent (SI) system. We attack both of these problems by applying strategies based on the popular family of linear model transformations and we further propose a novel stochastic transformation scheme named Maximum Likelihood Stochastic Transformations (MLST).

Advisor: Prof. Vassilios Digalakis

1996 - 2000

University of Durham, UK

MSc in Engineering

MSc Thesis: Distributed video through telecommunication networks using Fractal image compression techniques.

The research presented in this thesis investigates the use of fractal compression techniques for a real-time video distribution system. We initially describe the mathematical concepts and basic terminology of the fractal compression algorithm and examine several schemes for still images, including two novel contributions: partitioning the image into sections, which significantly reduced the compression time, and using the median metric as an alternative to the RMS.

We then examine the extension of the fractal compression algorithm from still images to image sequences and investigate three different schemes for reducing the temporal redundancy of the video compression algorithm. We show that a significant reduction in the execution time of the compression algorithm can be obtained.

Advisors: Prof. Alan Purvis and Dr. Simon Johnson

1994 - 1995

University of Groningen, NL

ERASMUS at the Computing Science Department

Diploma Thesis: Neural Networks in Speech Recognition.

Implementation of a speech recognizer for the numerical digits zero to nine. Mel-Frequency Cepstral Coefficients were extracted from the spoken waveforms and used as input features in several neural-network-based acoustic models.

Advisors: Dr.Ir. Jos Nijhuis and Prof.Dr.Ir. Lambert Spaanenburg

January 1994 - August 1994

University of Crete, GR

Diploma in Physics

Diploma Thesis: First-Principles calculations of the thermal properties of metals.

We calculated thermal properties of several metals (Li BCC, Li FCC, Ir, Rh, Ta, Pa) using their elastic constants and first-principles methods. Specifically, we considered and compared three different solid-state physics approaches: (a) direct summation of the phonon frequencies, (b) the Debye approach, and (c) the Debye-Grüneisen approach. The algorithms were implemented in Fortran.

Advisor: Dr. Antonis Andriotis

1989 - 1994

Patents and Publications

Patents

V. Tsiaras, V. Diakoloukas, Y. Stylianou, V. Digalakis, Speech Synthesis using Linear Dynamical Modelling, U.K. Patent #PN817594GB, 1507420.6, Filed April 30th 2015
V. Tsiaras, V. Diakoloukas, Y. Stylianou, V. Digalakis, Speech Synthesis using Linear Dynamical Modelling with Global Variance, U.K. Patent #PN817613GB, 1507422.2, Filed April 30th 2015

Journal Publications

V. Diakoloukas, F. Lygerakis, M. Lagoudakis and M. Kotti, “Variational Denoising Autoencoders and Least-Squares Policy Iteration for Statistical Dialogue Managers”, IEEE Signal Processing Letters, pp. 960-964, DOI: 10.1109/LSP.2020.2998361, 2020.
V. Tsiaras, R. Maia, V. Diakoloukas, Y. Stylianou and V. Digalakis, “Global Variance in Speech Synthesis with Linear Dynamical Models”, IEEE Signal Processing Letters, vol. 23, no. 8, pp. 1057-1061, Aug 2016.
N. Chatzichrisafis, V. Diakoloukas, V. Digalakis and C. Harizakis, “Gaussian Mixture Clustering and Language Adaptation for the Development of a New Language Speech Recognition System”, IEEE Transactions on Audio, Speech and Language Processing, 15(3):928-938, 2007.
C. Boulis, V. Diakoloukas, V. Digalakis, “Maximum Likelihood Stochastic Transformations Adaptation for Medium and Small Data Sets”, Computer Speech and Language, 15(3):257-287, 2001.
V. Diakoloukas and V. Digalakis. “Maximum-Likelihood Stochastic-Transformations Adaptation of Hidden Markov Models”, IEEE Transactions on Speech and Audio Processing, Vol.7, Num.2, pp.177-187, March 1999.

Conference Publications

F. Lygerakis, V. Diakoloukas, M. Lagoudakis and M. Kotti, “Robust Belief State Space Representation for Statistical Dialogue Managers using Deep Autoencoders”, IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019.
M. Kotti, V. Diakoloukas, A. Papangelis, M. Lagoudakis and Y. Stylianou, “A Case Study on the Importance of Belief State Representation for Dialogue Policy Management”, ISCA international conference (INTERSPEECH), 2018.
D. Georgiadou, V. Diakoloukas, V. Tsiaras and V. Digalakis, “Clockwork-RNN based architectures for Slot Filling”, ISCA international conference (INTERSPEECH), 2017.
V. Tsiaras, R. Maia, V. Diakoloukas, Y. Stylianou and V. Digalakis, “Global Variance in Speech Synthesis with Linear Dynamical Models”, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.
V. Tsiaras, R. Maia, V. Diakoloukas, Y. Stylianou and V. Digalakis, “Towards a Linear Dynamical Model based Speech Synthesizer”, ISCA International Conference (INTERSPEECH), 2015.
V. Tsiaras, R. Maia, V. Diakoloukas, Y. Stylianou and V. Digalakis, “Linear Dynamical Models in Speech Synthesis”, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.
D. Oikonomidis, V. Diakoloukas and V. Digalakis, “A sub-optimal Viterbi-like Search for Linear Dynamic Models Classification”, ISCA International Conference (INTERSPEECH), 2007.
G. Tsontzos, V. Diakoloukas, Ch. Koniaris and V. Digalakis, “Estimation of General Identifiable Linear Dynamic Models with an Application in Speech Recognition”, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2007.
N. Chatzichrisafis, V. Digalakis, V. Diakoloukas and C. Harizakis, “Rapid Acoustic Model Development using Gaussian Mixture Clustering and Language Adaptation”, International Conference on Speech and Language Processing (INTERSPEECH - ICSLP), October 2004.
V. Digalakis, D. Oikonomidis, D. Pratsolis, N. Tsourakis, C. Vosnidis, N. Chatzichrisafis and V. Diakoloukas, “Large Vocabulary Continuous Speech Recognition in Greek: Corpus and an Automatic Dictation System”, European Conference on Speech Communication and Technology (EUROSPEECH), September 2003.
V. Diakoloukas, V. Digalakis, L. Neumeyer and J. Kaja. “Development of Dialect-Specific Speech Recognizers Using Adaptation Methods”, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), April 1997.
V. Diakoloukas and V. Digalakis. “Adaptation of Hidden Markov Models Using Multiple Stochastic Transformations”, European Conference on Speech Communication and Technology (EUROSPEECH), September 1997.

Interests

Research Interests

Acoustic modeling and adaptation
Language modeling
Speech and speaker recognition
Speech synthesis
Digital Signal Processing (speech, image, video)
Dialog systems
Natural language processing and understanding
Machine Learning and pattern recognition
Voice-enabled applications
Multimodal user interfaces

Other Interests

Running, Biking and other sports
Music
Reading
Technology
Photography

Awards & Certifications

IEEE Signal Processing Society

“Young Author Best Paper Award in 2000”, for the paper “Maximum-Likelihood Stochastic-Transformation Adaptation of Hidden Markov Models”, ICASSP 2001, Salt Lake City, Utah, USA.