Analysis of variance (ANOVA)
Abstract
Univariate ANOVA is reviewed from a user's point of view, with emphasis on understanding the model building and the assumptions underlying the method. Illustrative examples are taken from organic chemistry and analytical chemistry. The use of graphical techniques to visualize the ANOVA model as well as to analyse residuals is recommended. The main models of ANOVA are developed in some detail, including one-factor ANOVA, crossed designs, nested designs, repeated measures ANOVA and variance components estimation. Hypothesis testing by F-tests and follow-up by pairwise comparison methods are shown. The distinction between random effects and fixed effects is explained. Methods to handle non-linearities by transformations or by using response surface methodology are mentioned. Throughout the paper the importance of experimental design is emphasized. References are given to ANOVA methods for more complicated models.
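One-factor ANOVA, the simplest model mentioned above, reduces to comparing the between-group and within-group mean squares via an F-ratio. The sketch below illustrates that computation in plain Python; the three "laboratory" samples are invented analytical-chemistry-style data, not data from the paper.

```python
# Minimal one-factor (one-way) ANOVA sketch; data are illustrative.

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of sample lists."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: group sizes times squared mean deviations
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Within-group sum of squares: squared deviations from each group mean
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)
    df_b, df_w = k - 1, n - k
    F = (ss_between / df_b) / (ss_within / df_w)
    return F, df_b, df_w

# Three hypothetical labs measuring the same analyte concentration
labs = [[10.1, 10.3, 9.9], [10.8, 11.0, 10.9], [9.5, 9.7, 9.6]]
F, df_b, df_w = one_way_anova(labs)
print(F, df_b, df_w)  # F is roughly 64.5 with (2, 6) degrees of freedom
```

A large F relative to the F-distribution with (df_between, df_within) degrees of freedom indicates that the group means differ by more than the within-group scatter can explain; the p-value lookup itself would normally be delegated to a statistics library.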
Cited by (487)
Knowledge-enhanced model with dual-graph interaction for confusing legal charge prediction
2024, Expert Systems with Applications
The rapid development of natural language processing (NLP) technologies has enabled the emergence of legal intelligence assistance systems, with legal charge prediction (LCP) being a critical technology. Automatic LCP aims to determine the final charges based on the fact descriptions of criminal cases. LCP assists human judges in managing workloads and improving efficiency, provides accessible legal guidance for individuals, and supports enterprises in litigation financing and compliance monitoring. However, distinguishing between confusing charges in real-world judicial practice remains a significant challenge. Most existing works cannot effectively capture complex relationships or discern subtle differences in fact descriptions, and they ignore legal schematic knowledge. To improve confusing-LCP performance, we propose a novel knowledge-aware model for legal charge prediction that leverages Graph Neural Networks (GNNs) to capture complex relationships within criminal case descriptions. Specifically, the model constructs structural and semantic graphs from fact descriptions and integrates information from both through a dual-graph interaction process. A legal knowledge transformer generates key knowledge representations at the schema and charge levels, while a knowledge matching network incorporates hierarchical charge knowledge into facts. In addition, we propose two real-world datasets, Criminal-All and Criminal-Confusing, containing 203 different charges and 86 confusing charges, respectively. To the best of our knowledge, these are the first well-organized datasets for the confusing LCP task. Experimental results demonstrate that the proposed model outperforms baselines and significantly improves the distinction of confusing charges, providing valuable support for intelligent legal judgment systems.
Difficulty-controllable question generation over knowledge graphs: A counterfactual reasoning approach
2024, Information Processing and Management
Difficulty-controllable question generation (DCQG) over knowledge graphs aims to generate questions with a given subgraph and a difficulty label, such as “easy” or “hard.” However, three significant challenges currently confront DCQG: (1) limited modes for modeling difficulty, (2) the inability to ensure causality between difficulty labels and generated outcomes, and (3) a lack of difficulty-annotated datasets. To overcome these challenges, we present a DCQG model that uses soft templates and counterfactual reasoning. The model utilizes a mixture of experts as soft-template selectors to enhance the diversity of difficulty representation. Soft templates can efficiently capture the similarity among questions of different difficulties, avoiding the need for constructing explicit templates. A disentanglement module is introduced to separate triple representations in the input subgraph that are pertinent and extraneous to the current question’s difficulty. Disentanglement minimizes the interference of irrelevant information on the generated output in neural networks due to entanglement. More importantly, disentangled representations enable the model to create training samples for counterfactual reasoning, strengthening causality between inputs and outputs. Additionally, we propose a question difficulty estimation method that simultaneously considers the input subgraph, question, and answering process. Extensive experiments reveal that our model can successfully generate questions at desired difficulty levels, surpassing the baselines by at least 8% in terms of difficulty control. Furthermore, the model exhibits superior generalizability and interpretability.
Application of minimum quantity GnP nanofluid and cryogenic LN2 in the machining of Hastelloy C276
2024, Tribology International
The current study delves into the combined influence of lubrication and cooling on the machinability of a difficult-to-cut superalloy. A comparative study of four lubricating mediums—dry, Minimum Quantity Lubrication (MQL), nanoMQL (NMQL), and Cryo-NMQL—was conducted during the milling of Hastelloy C276. Findings reveal that the Cryo-NMQL medium resulted in a 25.49%, 29.84%, and 42.50% decrease in cutting force, cutting temperature, and surface roughness, respectively, compared to dry cutting. This lubricating medium also decreased tool wear by 44.55%, as confirmed by SEM images showing reduced adhesion and abrasion. Analysis of chip morphology indicated a finer lamella structure with minimal serration under Cryo-NMQL, indicating enhanced efficiency. Furthermore, Cryo-NMQL refined the grain structure, enhancing microhardness for improved superalloy machining. ANOVA identified feed rate and cutting speed as the parameters most affecting the machining responses. Lastly, based on MORSM analysis, optimal machining parameters were determined as a cutting speed of 76.60 m/min, feed rate of 0.12 mm/tooth, radial depth of cut of 6.7 mm, and axial depth of cut of 0.6 mm. Experimental and predicted values closely matched, with a composite desirability score of 0.912, showcasing the efficacy of MORSM in addressing machining difficulties. This holistic investigation underscores Cryo-NMQL as a promising solution for the efficient and effective machining of superalloys.
An adaptive cyclical learning rate based hybrid model for Dravidian fake news detection
2024, Expert Systems with Applications
Fake news has evolved into a pervasive issue in the era of information overload, influencing public opinion and challenging the credibility of news sources. While various approaches have been proposed to combat fake news, most existing research focuses on high-resource languages, leaving low-resource languages vulnerable to misinformation. In this study, we propose a hybrid deep learning architecture that integrates dilated temporal convolutional neural networks (DTCN), bidirectional long short-term memory (BiLSTM), and a contextualized attention mechanism (CAM) to address the problem of detecting fake news in low-resourced Dravidian languages. DTCN is employed to capture temporal dependencies due to its sequential nature, BiLSTM is employed to seize long-range dependencies efficiently, and CAM is used to emphasize important information while downplaying irrelevant content. Additionally, we incorporate an adaptive cyclical learning rate with an early stopping mechanism to enhance model convergence. The results demonstrate that the proposed model surpasses the state-of-the-art and baseline models, achieving a higher average accuracy of 93.97% on the Dravidian_Fake dataset across four Dravidian languages.
CNN-LSTM and transfer learning models for malware classification based on opcodes and API calls
2024, Knowledge-Based Systems
In this paper, we propose a novel model for a malware classification system based on Application Programming Interface (API) calls and opcodes, to improve classification accuracy. This system uses a novel combined Convolutional Neural Network and Long Short-Term Memory design. We extract opcode sequences and API calls from Windows malware samples for classification. We transform these features into N-gram sequences (N = 2, 3, and 10). Our experiments on a dataset of 9,749,57 samples produce a high accuracy of 99.91% using the 8-gram sequences. Our method significantly improves malware classification performance when used with a wide range of recent deep learning architectures, leading to state-of-the-art performance. In particular, we experiment with ConvNeXt-T, ConvNeXt-S, RegNetY-4GF, RegNetY-8GF, RegNetY-12GF, EfficientNetV2, Sequencer2D-L, Swin-T, ViT-G/14, ViT-Ti, ViT-S, ViT-B, ViT-L, and MaxViT-B. Among these architectures, Swin-T and Sequencer2D-L achieved high accuracies of 99.82% and 99.70%, respectively, comparable to our CNN-LSTM architecture although not surpassing it.
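The N-gram feature extraction this abstract describes (sliding windows over opcode or API-call sequences) can be sketched as follows; the opcode list and helper function are illustrative assumptions, not the authors' actual pipeline.

```python
# Hedged sketch of N-gram extraction over an opcode sequence.
# The opcode list below is made up for illustration.

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

opcodes = ["push", "mov", "call", "pop", "ret"]
bigrams = ngrams(opcodes, 2)
print(bigrams)  # four overlapping opcode pairs, starting with ('push', 'mov')
```

In practice, the resulting n-grams would be hashed or indexed into a fixed vocabulary before being fed to a CNN-LSTM or transformer-style classifier.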
A criminal macrocause classification model: An enhancement for violent crime analysis considering an unbalanced dataset
2024, Expert Systems with Applications
This study introduces a novel model designed to classify macrocauses of violent crimes. The model’s practical application is demonstrated through its integration into the framework of the Natal Smart City Initiative in Brazil. Utilizing the Design Science methodology, the study details the model’s development, its subsequent implementation through a machine learning pipeline, and its assessment employing four prominent classification techniques: Decision Trees, Logistic Regression, Random Forest, and XGBoost. XGBoost performed exceptionally well, achieving an average accuracy of 0.961791, an F1-score of 0.961410, and a ROC-curve AUC of 0.994732. Accurate classification of criminal macrocauses is crucial for developing effective public safety policies. The proposed model can provide public safety institutions and criminal analysts with a valuable tool for better understanding aspects of violent crime analysis in their cities. This can streamline the analysis and management process and provide more accurate information for decision-making. The study also has important implications for the emerging field of smart cities: by providing a tool to assist in decision-making and planning public safety strategies, this work contributes to the development of innovative, data-based, and theory-based solutions to urban challenges.
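The evaluation metrics reported above (accuracy, F1, AUC) have simple closed forms; a minimal sketch of accuracy and binary F1 is shown below, with toy labels and predictions that are made up, not the study's data.

```python
# Illustrative accuracy and binary F1 computations; data are invented.

def accuracy(y_true, y_pred):
    """Fraction of predictions matching the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_binary(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]
print(accuracy(y_true, y_pred), f1_binary(y_true, y_pred))
```

For a multi-class task such as macrocause classification, the per-class F1 scores would typically be averaged (macro or weighted), and AUC computed one-vs-rest.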