Real-time measurement and feedback control of key plasma parameters are critical for future fusion reactor operation, with ion temperature being a vital control target as part of the triple product for fusion ignition. However, plasma diagnostics often require complex data analysis. A widely used method of obtaining the ion temperature $T_{\mathrm{i}}$ from charge exchange recombination spectroscopy (CXRS) is iterative spectral fitting, which is time-consuming and requires expert intervention during data analysis. The traditional method therefore cannot meet the demand for real-time $T_{\mathrm{i}}$ measurement. A neural network (NN), which can learn the underlying relationship between the measured spectra and $T_{\mathrm{i}}$, is a promising approach to this problem. Indeed, NN approaches have been widely adopted in the field of magnetically confined plasma. A previous study on JET achieved satisfactory accuracy in inferring $T_{\mathrm{i}}$ from CXRS spectra compared with traditional fitting results, and disruption prediction has recently made great progress with the help of deep NNs. However, these studies were conducted on steadily operating devices, where the data distribution of the training set is similar to that of the test set for NN models. This is not the case for newly built tokamaks like HL-3, nor for future fusion reactors such as ITER: on a new device, the plasma parameters take some time to rise from low to high ranges. Investigating the extrapolation capability of NN models trained on low-parameter data is therefore of paramount importance. Here, a convolutional neural network (CNN)-based model is proposed to accelerate the analysis of CXRS spectral data, with a focus on investigating the model's extrapolation capability in a much higher $T_{\mathrm{i}}$ range. 
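The paper does not specify the network architecture here; the following is a minimal illustrative sketch of the kind of 1-D CNN regression such a model implies, written in plain NumPy. The layer sizes, kernel shapes, and random (untrained) weights are all assumptions for illustration, not the authors' actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels, bias):
    """Valid-mode 1-D convolution. x: (c_in, length); kernels: (c_out, c_in, k)."""
    c_out, c_in, k = kernels.shape
    length = x.shape[1] - k + 1
    out = np.empty((c_out, length))
    for o in range(c_out):
        for i in range(length):
            out[o, i] = np.sum(kernels[o] * x[:, i:i + k]) + bias[o]
    return out

def cnn_ti_estimate(spectrum, params):
    """Forward pass: conv -> ReLU -> global average pool -> linear -> Ti (keV)."""
    h = conv1d(spectrum[None, :], params["w1"], params["b1"])
    h = np.maximum(h, 0.0)          # ReLU nonlinearity
    pooled = h.mean(axis=1)         # global average pooling over wavelength axis
    return float(pooled @ params["w2"] + params["b2"])

# Illustrative untrained parameters: 8 filters of width 5 on a single input channel.
params = {
    "w1": rng.normal(scale=0.1, size=(8, 1, 5)),
    "b1": np.zeros(8),
    "w2": rng.normal(scale=0.1, size=8),
    "b2": 0.0,
}
ti_pred = cnn_ti_estimate(rng.random(256), params)  # 256-pixel spectrum -> scalar Ti
```

In practice such a model would be trained on measured spectra labeled by the offline fitting results; the sub-millisecond inference time reported in the abstract comes from replacing iterative fitting with a single forward pass of this kind.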
The dataset contains about 122000 spectra, together with the corresponding $T_{\mathrm{i}}$ values inferred by the offline iterative process. The results demonstrate that the CNN-based model analyzes $T_{\mathrm{i}}$ accurately, as indicated by a coefficient of determination (R²) of 0.92, and reduces the inference time for a single spectrum to less than 1 ms, 100–1000 times faster than traditional spectral fitting. However, the performance of the data-driven NN model is limited by challenges such as insufficient data and an imbalanced data distribution, which further degrade its extrapolation capability. In general, data with higher $T_{\mathrm{i}}$ account for only a small portion of the total dataset; in this study, only about 5% of the spectra correspond to $T_{\mathrm{i}} > 2\ \mathrm{keV}$ (ranging from 2 to 4 keV). Yet these data reflect the temperature of the plasma core, which is more important for assessing plasma performance. To overcome this limitation, this study synthesizes high-temperature data from experimental discharges with $T_{\mathrm{i}}$ in the low-temperature range. By adding 5% synthetic data to a training set consisting only of data with $T_{\mathrm{i}} < 2\ \mathrm{keV}$, the model's extrapolation capability is extended to cover the whole range $T_{\mathrm{i}} < 4\ \mathrm{keV}$. The mean relative error (MRE) of the model in the range $3\ \mathrm{keV} < T_{\mathrm{i}} < 4\ \mathrm{keV}$ is reduced from 35% to below 15%, a reduction of approximately 60% relative to the MRE before adding synthetic data. 
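One way such high-$T_{\mathrm{i}}$ spectra could be synthesized is by exploiting the Doppler relation, in which the thermal line width scales as $\sigma = \lambda_0\sqrt{T_{\mathrm{i}}/m_i c^2}$. The sketch below generates a single Gaussian CX line broadened to a target $T_{\mathrm{i}}$; the line wavelength (C VI $n=8\rightarrow7$ at 529.05 nm), carbon ion rest energy, detector grid, and noise model are illustrative assumptions, not the paper's actual synthesis pipeline.

```python
import numpy as np

# Illustrative constants: C VI n=8->7 charge-exchange line, carbon-12 rest energy.
LAMBDA0_NM = 529.05
MC2_KEV = 1.1178e7  # ~12 u * 931.494 MeV/c^2, in keV

def doppler_sigma_nm(ti_kev):
    """Thermal Doppler width: sigma = lambda0 * sqrt(Ti / (m_i c^2))."""
    return LAMBDA0_NM * np.sqrt(ti_kev / MC2_KEV)

def synthetic_spectrum(ti_kev, wavelengths_nm, amplitude=1.0,
                       noise_level=0.02, rng=None):
    """Gaussian line broadened to the target Ti, plus additive Gaussian noise."""
    if rng is None:
        rng = np.random.default_rng()
    sigma = doppler_sigma_nm(ti_kev)
    line = amplitude * np.exp(-0.5 * ((wavelengths_nm - LAMBDA0_NM) / sigma) ** 2)
    return line + noise_level * rng.standard_normal(wavelengths_nm.size)

wl = np.linspace(528.5, 529.6, 256)          # assumed 256-pixel detector grid
spec = synthetic_spectrum(3.5, wl)           # a synthetic Ti = 3.5 keV spectrum
```

Because the width grows as $\sqrt{T_{\mathrm{i}}}$, spectra at temperatures above the measured range can be generated from line shapes anchored in low-$T_{\mathrm{i}}$ experimental data, which is the essence of the augmentation strategy described above.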
This approach demonstrates the feasibility of using synthetic data to enhance the performance of artificial intelligence algorithms in the field of magnetic confinement fusion. These findings provide valuable insights for the development of real-time ion temperature measurement and feedback control on future high-parameter fusion devices. Furthermore, the study lays a foundation for research in areas that require strong cross-device generalization, such as machine learning-based disruption prediction and tearing mode control.