
[Seminar] 2025-12-22 09:00-17:10 NCKU STEM Program for Higher Education: Seminar on Optimization, Statistics and Machine Learning

NCKU STEM Program for Higher Education

Seminar on Optimization, Statistics and Machine Learning

Tainan, Taiwan, Dec. 22, 2025

Organizers: NCKU Center for General Education; Interdisciplinary Research Center on Mathematics (College of Science, NCKU)

 

9:00 – 10:40, Dec. 22 Monday, Venue: Room 27103, Center of General Education

Speaker: 王俊焜 Jun-Kun Wang (Assistant Professor, Department of Electrical and Computer Engineering, UC San Diego)

Talk Title: Statistics Meets Optimization (I) & (II) (Lecture in Chinese)

Abstract: In this talk, I will focus on a literature overview of "Statistics Meets Optimization" from my personally biased perspective, while also covering my recent research at this intersection at a high level.

In the first part of the talk (I), I will show how the old idea of "control variates" is used to design optimization algorithms such as the Stochastic Variance-Reduced Gradient method (SVRG), as well as the recently popular technique of Prediction-Powered Inference (PPI) in statistical inference. SVRG has become a classical algorithm for reducing variance in stochastic optimization. PPI, on the other hand, concerns how to use machine learning predictions on data that mostly lack ground-truth labels to improve statistical inference tasks such as constructing a confidence interval. I will show how the same algorithmic principle has been used in these seemingly different areas. After that, I will share my recent effort in this direction.
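To make the shared principle concrete, here is a minimal Python sketch of the SVRG-style control-variate gradient estimator; the function names, default constants, and the least-squares example are illustrative assumptions, not material from the talk.

```python
import numpy as np

def svrg(grad_i, full_grad, x0, n, step=0.01, epochs=20, inner=None, rng=None):
    """Minimal SVRG sketch: the snapshot's full gradient acts as a control variate.

    grad_i(x, i)  -- gradient of the i-th component function at x
    full_grad(x)  -- full (batch) gradient at x
    """
    rng = np.random.default_rng() if rng is None else rng
    inner = n if inner is None else inner
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(epochs):
        snapshot = x.copy()
        mu = full_grad(snapshot)  # mean of the control variate grad_i(snapshot, i)
        for _ in range(inner):
            i = rng.integers(n)
            # unbiased estimator whose variance shrinks as x approaches the snapshot
            g = grad_i(x, i) - grad_i(snapshot, i) + mu
            x = x - step * g
        # a fresh snapshot is taken at the start of the next epoch
    return x

# Illustrative use on least squares f(x) = (1/2n) * ||A x - b||^2
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 5))
b = A @ np.ones(5)
x_hat = svrg(lambda x, i: A[i] * (A[i] @ x - b[i]),
             lambda x: A.T @ (A @ x - b) / len(b),
             x0=np.zeros(5), n=len(b))
```

The correction term grad_i(snapshot, i) - mu has mean zero and is correlated with grad_i(x, i), which is exactly the role a control variate plays in classical Monte Carlo estimation.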

Then, I will switch gears to the second part of the talk (II). I will first review a neat result in the literature that casts non-parametric sequential hypothesis testing as an online convex optimization problem, where an online learner tries to bet whether the null hypothesis is true or false, and a tighter regret bound suggests a faster stopping time to reject the null when the alternative is true. Then, I will show how relevant techniques can be used to design algorithms with strong statistical guarantees in online detection of LLM-generated texts. After that, I will introduce a new algorithm that overcomes the limitations of a commonly-used method for sequential hypothesis testing by betting, which potentially leads to a faster rejection time under the alternative while controlling the false positive rate.
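As a much-simplified illustration of sequential testing by betting, the sketch below tests whether bounded observations have mean above a threshold by tracking a wealth process and rejecting once the wealth exceeds 1/alpha. The plug-in betting rule, the function name, and the threshold m0 are my own illustrative choices; the talk concerns choosing the bets via online learning and newer algorithms with stronger guarantees.

```python
import numpy as np

def betting_test(xs, m0=0.5, alpha=0.05):
    """Toy sequential test of H0: E[X] <= m0 for observations X in [0, 1], by betting.

    Wealth W_t = prod_{s<=t} (1 + lam_s * (x_s - m0)) with a predictable bet
    lam_s in [0, 1/m0] is a nonnegative supermartingale under H0, so Ville's
    inequality gives P(sup_t W_t >= 1/alpha) <= alpha.  Reject H0 once
    W_t >= 1/alpha; a better betting rule rejects sooner under the alternative.
    """
    wealth, mean, var, n = 1.0, m0, 0.25, 0
    for t, x in enumerate(xs, start=1):
        # plug-in bet computed from past data only (kept well inside [0, 1/m0])
        lam = float(np.clip((mean - m0) / (var + (mean - m0) ** 2 + 1e-12),
                            0.0, 0.5 / m0))
        wealth *= 1.0 + lam * (x - m0)
        if wealth >= 1.0 / alpha:
            return t, wealth          # reject H0 at time t
        n += 1                        # update running mean/variance for the next bet
        delta = x - mean
        mean += delta / n
        var += (delta * (x - mean) - var) / n
    return None, wealth               # never rejected within the sample

# Illustrative run: true mean 0.6 > 0.5, so the test should reject fairly quickly
rng = np.random.default_rng(0)
stop_time, final_wealth = betting_test(rng.uniform(0.2, 1.0, size=2000))
```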

 

TEA BREAK
10:40 – 11:10, Dec. 22 Monday, Venue: Concourse of Center of General Education

 

11:10 – 12:00, Dec. 22 Monday, Venue: Room 27103, Center of General Education

馮若梅 (Joe-Mei Feng, Assistant Professor, Department of Computer Science and Information Engineering, Tamkang University)

Title: Making Autonomous Driving Smarter with Data Selection (Lecture in Chinese)

Abstract: In autonomous driving object detection, expanding data diversity while adapting models to real-world environments is crucial. Online training on every new sample is costly and impractical. We propose a data selection strategy using distance metrics and clustering. New data far from existing categories are added to training with weights inversely proportional to their class size, while overrepresented classes are ignored. Stratified sampling mixes old and new data to reduce bias. This approach improves model adaptability, efficiency, and continual learning performance in autonomous driving scenarios.
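A rough sketch of this kind of selection rule is given below, assuming per-sample feature embeddings are available. The centroid-based novelty test, the thresholds, and the function name are illustrative assumptions rather than the authors' exact pipeline, and the stratified old/new mixing step is only indicated in a comment.

```python
import numpy as np

def select_new_data(old_feats, old_labels, new_feats, dist_thresh=2.0, max_frac=0.3):
    """Illustrative data-selection sketch for continual training.

    A new sample is kept if its embedding is far from every existing class
    centroid (a simple stand-in for a distance/clustering-based novelty test).
    Kept samples get weights inversely proportional to the size of their nearest
    class; samples whose nearest class is already overrepresented are skipped.
    """
    classes = np.unique(old_labels)
    centroids = np.stack([old_feats[old_labels == c].mean(axis=0) for c in classes])
    sizes = np.array([(old_labels == c).sum() for c in classes])
    overrepresented = sizes > max_frac * len(old_labels)

    kept, weights = [], []
    for i, z in enumerate(new_feats):
        dists = np.linalg.norm(centroids - z, axis=1)
        nearest = int(dists.argmin())
        if overrepresented[nearest]:
            continue                       # class already has plenty of data
        if dists.min() >= dist_thresh:     # far from existing categories -> novel
            kept.append(i)
            weights.append(1.0 / sizes[nearest])
    # A stratified sampler would then mix these indices with old data per class
    # so that no single class dominates the updated training set.
    return np.array(kept, dtype=int), np.array(weights)

# Illustrative use with random 64-d embeddings (stand-ins for detector features)
rng = np.random.default_rng(0)
old_f, old_y = rng.normal(size=(500, 64)), rng.integers(0, 5, size=500)
idx, w = select_new_data(old_f, old_y, rng.normal(size=(100, 64)) + 3.0)
```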

 

Lunch
12:10 – 13:20, Dec. 22 Monday, Venue: Concourse of Center of General Education

 

13:20 – 14:10, Dec. 22 Monday, Venue: Room 27103, Center of General Education

王敏齊 (Min-Chi Wang, PhD Candidate, Institute of Applied Mathematics, National Cheng Kung University)

Talk title: Implementable Unsolvability Criteria for Systems of Two Quadratic (In)equalities (Lecture in English)

Talk abstract:

Let f(x) and g(x) be two real quadratic functions defined on ℝⁿ. The decision problem of whether there exists a common solution to the systems of (in)equalities [f(x) = 0, g(x) = 0] (Calabi Theorem) and [f(x) = 0, g(x) ≤ 0] (strict Finsler Lemma), respectively, has rarely been studied in the literature. In this talk, we establish necessary and sufficient conditions for non-homogeneous variants of the Calabi Theorem as well as the strict Finsler Lemma. To make these criteria implementable, we reformulate the key conditions using matrix pencils and provide an efficient procedure for computing them, enabling fast numerical feasibility tests. Finally, we benchmark our method against existing approaches, demonstrating clear gains in both computational speed and reliability.
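For background, the classical homogeneous Calabi theorem (Calabi, 1964), which the non-homogeneous variants in the talk extend, can be stated as follows; this statement is standard and is not the talk's new criterion.

```latex
% Homogeneous Calabi theorem (n >= 3), for symmetric A, B in R^{n x n}:
% the quadratic forms x^T A x and x^T B x have no common zero other than x = 0
% if and only if some linear combination of A and B is positive definite.
\{\, x \in \mathbb{R}^n \setminus \{0\} \;:\; x^{\top} A x = 0,\ x^{\top} B x = 0 \,\} = \varnothing
\quad\Longleftrightarrow\quad
\exists\, \mu, \nu \in \mathbb{R} \ \text{with}\ \mu A + \nu B \succ 0 .
```

In the non-homogeneous setting considered in the talk, f and g are general quadratic functions rather than forms, and such clean definiteness certificates are no longer automatic, which is why new necessary and sufficient conditions are needed.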

 

14:10 – 15:00, Dec. 22 Monday, Venue: Room 27103, Center of General Education

朱雅琪 (Ya-Chi Chu, PhD candidate in Mathematics, Stanford University)

Title: Gradient methods with online scaling (Lecture in English)

Abstract: Matrix stepsizes, commonly referred to as preconditioners, play a crucial role in accelerating modern first-order optimization methods by adapting to optimization landscapes with highly heterogeneous curvature across dimensions. It remains challenging to determine optimal matrix stepsizes in practice, which typically requires problem-specific expertise or computationally expensive pre-processing. In this talk, we present a novel family of algorithms, online scaled gradient methods (OSGM), that employs online learning to learn the matrix stepsizes and provably accelerates first-order methods. The convergence rate of OSGM is asymptotically no worse than the rate achieved by the optimal matrix stepsize. On smooth convex problems, OSGM provides new trajectory-dependent global convergence guarantees; on strongly convex problems, OSGM constitutes a new family of first-order methods with nonasymptotic superlinear convergence, joining the celebrated quasi-Newton methods. Our experiments show that OSGM substantially outperforms existing adaptive first-order methods and frequently matches the performance of L-BFGS, an efficient quasi-Newton method, while using less memory and requiring less computation per iteration.
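As a very rough illustration of the "learn the stepsize online" idea, and emphatically not the OSGM algorithm or its surrogate losses, the sketch below runs gradient descent with a diagonal scaling that is adjusted by a normalized hypergradient step; all names and constants are mine.

```python
import numpy as np

def gd_with_online_diagonal_scaling(grad, x0, iters=300, d0=1e-2, eta=1e-2):
    """Toy sketch: run gradient descent while learning a diagonal stepsize online.

    Each iteration takes x <- x - d * grad(x), then nudges the per-coordinate
    scaling d against the hypergradient of the one-step-ahead objective
    f(x - d * grad(x)), whose i-th partial derivative is -grad(x)_i * grad(x_next)_i.
    The hypergradient is normalized so the scaling changes slowly.  This only
    illustrates learning a diagonal stepsize with online updates; it is not
    the OSGM method from the talk.
    """
    x = np.asarray(x0, dtype=float).copy()
    d = np.full_like(x, d0)
    g = grad(x)
    for _ in range(iters):
        x_next = x - d * g
        g_next = grad(x_next)
        h = -g * g_next                                       # hypergradient w.r.t. d
        d = np.maximum(d - eta * h / (np.abs(h).max() + 1e-12), 0.0)
        x, g = x_next, g_next
    return x

# Illustrative ill-conditioned quadratic: f(x) = 0.5 * sum_i s_i * x_i^2
s = np.logspace(0, 2, 8)
x_hat = gd_with_online_diagonal_scaling(lambda x: s * x, x0=np.ones(8))
```

OSGM replaces this ad hoc update with a principled online-learning procedure over matrix stepsizes, which is what yields the guarantees described in the abstract.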

 

TEA BREAK
15:00 – 15:30, Dec. 22 Monday, Venue: Concourse of Center of General Education

 

15:30 – 16:20, Dec. 22 Monday, Venue: Room 27103, Center of General Education

王俊焜 Jun-Kun Wang (Assistant Professor, Department of Electrical and Computer Engineering, UC San Diego)

Talk title: Interplay between optimization and no-regret learning (Lecture in English)

Talk abstract: In this talk, I will show how to design and analyze first-order convex optimization algorithms by playing a two-player zero-sum game. In particular, I will highlight a strong connection between online learning (a.k.a. no-regret learning) and optimization. It turns out that several classical optimization updates can be generated from the game dynamics by pitting pairs of online learners against each other. These include Nesterov's accelerated methods, the accelerated proximal gradient method (Beck and Teboulle, 2009), the Frank–Wolfe method, and more. This fresh perspective of "optimization as a game" also gives rise to new accelerated Frank–Wolfe methods over certain types of constraint sets. Furthermore, by summing the weighted average regrets of both players in the game, one obtains the convergence rate of the resulting optimization algorithm, which provides a simple and modular technique for analyzing these optimization methods.
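The last point can be made precise with the standard regret-to-rate reduction for convex-concave games, stated here for the Fenchel game; the notation below is mine, and the weighting scheme is only one common instantiation of the idea described in the abstract.

```latex
% Fenchel game: for convex f with conjugate f^*,  f(x) = \max_y \langle x, y\rangle - f^*(y),
% so minimizing f is the zero-sum game with payoff  g(x, y) = \langle x, y\rangle - f^*(y).
% If the two online learners produce (x_t, y_t) with weighted regrets R_T^x, R_T^y
% under weights \alpha_t > 0 (A_T = \sum_{t=1}^{T} \alpha_t), then the weighted averages
% \bar{x}_T = A_T^{-1} \sum_t \alpha_t x_t  and  \bar{y}_T = A_T^{-1} \sum_t \alpha_t y_t  satisfy
f(\bar{x}_T) - \min_{x} f(x)
\;\le\; \max_{y} g(\bar{x}_T, y) - \min_{x} g(x, \bar{y}_T)
\;\le\; \frac{R_T^{x} + R_T^{y}}{A_T}.
```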

 

16:20 – 17:10, Dec. 22 Monday, Venue: Room 27103, Center of General Education

Rahul Parhi (Assistant Professor, Department of Electrical and Computer Engineering, UC San Diego)

Seminar Title: Do Neural Networks Generalize Well? Low Norm Solutions vs. Flat Minima (Lecture in English)

Abstract: This talk investigates the fundamental differences between low-norm and flat solutions of shallow ReLU network training problems, particularly in high-dimensional settings. We show that global minima with small weight norms exhibit strong generalization guarantees that are dimension-independent. In contrast, local minima that are “flat” can generalize poorly as the input dimension increases. We attribute this gap to a phenomenon we call neural shattering, where neurons specialize to extremely sparse input regions, resulting in activations that are nearly disjoint across data points. This forces the network to rely on large weight magnitudes, leading to poor generalization. Our theoretical analysis establishes an exponential separation between flat and low-norm minima. In particular, while flatness does imply some degree of generalization, we show that the corresponding convergence rates necessarily deteriorate exponentially with input dimension. These findings suggest that flatness alone does not fully explain the generalization performance of neural networks.

Bio: Rahul Parhi is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of California, San Diego. Prior to joining UCSD, he was a Postdoctoral Researcher at the École Polytechnique Fédérale de Lausanne (EPFL), where he worked from 2022 to 2024. He completed his PhD in Electrical Engineering at the University of Wisconsin-Madison in 2022. His research interests lie at the interface between functional and harmonic analysis and data science.
