|
143 | 143 | }
|
144 | 144 |
|
145 | 145 | \newglossaryentry{vfl}
|
146 |
| -{name={vertical federated learning (VFL)},description= |
147 |
| - {VFL\index{vertical federated learning (VFL)} uses \gls{localdataset}s that are constituted |
148 |
| - by the same \gls{datapoint}s but characterizing them with different \gls{feature}s \cite{VFLChapter}. |
149 |
| - For example, different healthcare providers might all contain information |
150 |
| - about the same population of patients. However, different healthcare providers |
151 |
| - collect different measurements (e.g., blood values, electrocardiography, lung X-ray) |
152 |
| - for the same patients.}, |
| 146 | +{name={vertical federated learning (VFL)}, |
| 147 | + description={ |
| 148 | + VFL\index{vertical federated learning (VFL)} refers to \gls{fl} applications where |
| 149 | + \gls{device}s have access to different \gls{feature}s of the same set of \gls{datapoint}s \cite{VFLChapter}. |
| 150 | + Formally, the underlying global \gls{dataset} is |
| 151 | + \[ |
| 152 | + \dataset^{(\mathrm{global})} \defeq \left\{ \left(\featurevec^{(1)}, \truelabel^{(1)}\right), \ldots, \left(\featurevec^{(\samplesize)}, \truelabel^{(\samplesize)}\right) \right\}. |
| 153 | + \] |
| 154 | + We denote by $\featurevec^{(\sampleidx)} = \big( \feature^{(\sampleidx)}_{1}, \ldots, \feature^{(\sampleidx)}_{\nrfeatures'} \big)^{T}$, for $\sampleidx=1,\ldots,\samplesize$, |
| 155 | + the complete \gls{featurevec}s for the \gls{datapoint}s. Each \gls{device} $\nodeidx \in \nodes$ |
| 156 | + observes only a subset $\mathcal{F}^{(\nodeidx)} = \{\featureidx_{1},\ldots,\featureidx_{\nrfeatures}\} \subseteq \{1,\ldots,\nrfeatures'\}$ of \gls{feature}s, resulting |
| 157 | + in a \gls{localdataset} $\localdataset{\nodeidx}$ with \gls{featurevec}s |
| 158 | + \[ |
| 159 | + \featurevec^{(\nodeidx,\sampleidx)} = \big( \feature^{(\sampleidx)}_{\featureidx_{1}}, \ldots, \feature^{(\sampleidx)}_{\featureidx_{\nrfeatures}} \big)^{T}. |
| 160 | + \] |
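| | + As a purely illustrative instance (the sizes and index sets here are assumed for the sake of example), |
| | + consider $\nrfeatures'=3$ and two \gls{device}s with $\mathcal{F}^{(1)}=\{1,2\}$ and $\mathcal{F}^{(2)}=\{3\}$. |
| | + The \gls{featurevec}s in the two \gls{localdataset}s would then be |
| | + \[ |
| | + \featurevec^{(1,\sampleidx)} = \big( \feature^{(\sampleidx)}_{1}, \feature^{(\sampleidx)}_{2} \big)^{T} \quad \mbox{and} \quad \featurevec^{(2,\sampleidx)} = \big( \feature^{(\sampleidx)}_{3} \big)^{T}, \quad \mbox{for } \sampleidx=1,\ldots,\samplesize. |
| | + \] |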
| 161 | + Some of the \gls{device}s might also have access to the \gls{label}s $\truelabel^{(\sampleidx)}$, for $\sampleidx=1,\ldots,\samplesize$, |
| 162 | + of the global \gls{dataset}. One potential application of \gls{vfl} is to enable collaboration between |
| 163 | + different healthcare providers. Each provider collects distinct types of measurements (such as blood |
| 164 | + values, electrocardiography, and lung X-rays) for the same patients. Another application is a |
| 165 | + national social insurance system, where health records, financial indicators, consumer behaviour, |
| 166 | + and mobility \gls{data} are collected by different institutions. \gls{vfl} enables joint learning across |
| 167 | + these parties while allowing well-defined levels of \gls{privprot}. |
| 168 | + \begin{figure}[htbp] |
| 169 | + \begin{center} |
| 170 | + \begin{tikzpicture}[every node/.style={anchor=base}] |
| 171 | + % --- Coordinate definitions --- |
| 172 | + \def\colX{0} |
| 173 | + \def\colY{1.6} |
| 174 | + \def\colZ{3.2} |
| 175 | + \def\colD{4.8} |
| 176 | + \def\colLabel{6.4} |
| 177 | + \def\rowOne{0} |
| 178 | + \def\rowTwo{-1.2} |
| 179 | + \def\rowThree{-2.4} |
| 180 | + \def\rowFour{-3.6} |
| 181 | + % Manually place matrix entries |
| 182 | + \foreach \i/\label in {1/1, 2/2, 4/\samplesize} { |
| 183 | + \pgfmathsetmacro{\y}{-1.2*(\i-1)} |
| 184 | + \node (x\i1) at (0,\y) {$x^{(\label)}_{1}$}; |
| 185 | + \node (x\i2) at (1.6,\y) {$x^{(\label)}_{2}$}; |
| 186 | + \node (dots\i) at (3.2,\y) {$\cdots$}; |
| 187 | + \node (x\i3) at (4.8,\y) {$x^{(\label)}_{\nrfeatures'}$}; |
| 188 | + \node (y\i) at (6.4,\y) {$\truelabel^{(\label)}$}; |
| 189 | + } |
| 190 | + % Outer rectangle for the full dataset |
| 191 | + \draw[dashed, rounded corners, thick] |
| 192 | + (-0.6,0.6) rectangle (6.9,-4.2); |
| 193 | + \node at (3.1,0.9) {$\dataset^{(\mathrm{global})} $}; |
| 194 | + % Rectangle for local dataset 1 (e.g., first two features) |
| 195 | + \draw[dashed, rounded corners, thick] |
| 196 | + (-0.9,0.9) rectangle (2.1,-4.0); |
| 197 | + \node at (0.25,1.0) {$\localdataset{1}$}; |
| 198 | + % --- Local dataset k (columns 2–3, rows 1–3) --- |
| 199 | + \draw[dashed, rounded corners, thick] |
| 200 | + ($( \colZ + 1, 0.9 )$) rectangle |
| 201 | + ($( \colLabel + 0.4, -4.5)$); |
| 202 | + \node at ($( \colZ + 0.9,-5 )$) {$\localdataset{\nodeidx}$}; |
| 203 | + \end{tikzpicture} |
| 204 | + \end{center} |
| 205 | + \caption{VFL uses \gls{localdataset}s that are derived from the \gls{datapoint}s of a common global \gls{dataset}. |
| 206 | + The \gls{localdataset}s differ in the choice of \gls{feature}s used to characterize the \gls{datapoint}s.\label{fig_vertical_FL}} |
| 207 | + \end{figure}}, |
153 | 208 | first={vertical federated learning (VFL)},text={VFL}
|
154 | 209 | }
|
155 | 210 |
|
|