AI’s understanding and reasoning skills can’t be assessed by current tests

3 months ago

Citations

E. Arakelyan, Z. Liu and I. Augenstein. Semantic sensitivities and inconsistent predictions: Measuring the fragility of NLI models. Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics. March 2024.

N. Alzahrani et al. When benchmarks are targets: Revealing the sensitivity of large language model leaderboards. arXiv:2402.01781. February 1, 2024.

N. Dziri et al. Faith and fate: Limits of transformers on compositionality. Advances in Neural Information Processing Systems 36. February 2024.

P. West et al. The generative AI paradox: “What it can create, it may not understand.” International Conference on Learning Representations. January 16, 2024.

C. Deng et al. Investigating data contamination in modern benchmarks for large language models. arXiv:2311.09783. November 16, 2023.

R. Burnell et al. Rethink reporting of evaluation results in AI. Science. Vol. 380, April 14, 2023, p. 136. doi:10.1126/science.adf6369.

E. Davis. Benchmarks for automated commonsense reasoning: A survey. arXiv:2302.04752. February 9, 2023.

Y. Elazar et al. Back to square one: Artifact detection, training and commonsense disentanglement in the Winograd Schema. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. November 2021. doi: 10.18653/v1/2021.emnlp-main.819.

D. Hendrycks et al. Measuring Massive Multitask Language Understanding. International Conference on Learning Representations. January 12, 2021.

P. Trichelair et al. How reasonable are common-sense reasoning tasks: A case-study on the Winograd Schema Challenge and SWAG. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). November 2019. doi: 10.18653/v1/D19-1335.

Ananya is a freelance science writer, journalist and translator, with a research background in robotics. She covers all things algorithms, robots, animals, oceans, urban and the people involved in these fields.

Source sciencenews