This paper evaluates the performance of three of the most prominent multisectoral static applied general equilibrium (GE) models used to predict the impact of the North American Free Trade Agreement (NAFTA). These models drastically underestimated the impact of NAFTA on North American trade. Furthermore, the models failed to capture much of the relative impact on different sectors. Ex-post performance evaluations of applied GE models are essential if policymakers are to have confidence in the results produced by these models. Such evaluations also help make applied GE analysis a scientific discipline in which there are well-defined puzzles with clear successes and failures for competing theories. Analysis of sectoral trade data indicates the need for a new theoretical mechanism that generates large increases in trade in product categories with little or no previous trade. To capture changes in macroeconomic aggregates, the models need to be able to capture changes in productivity.