-
Notifications
You must be signed in to change notification settings - Fork 100
Description
Bug: File format validation occurs after encoding when using auto force_encoding
Description
When force_encoding: :auto is configured, the file encoding process runs BEFORE file format validation. This causes the system to attempt encoding operations on invalid file formats (e.g., .txt files), leading to unnecessary processing overhead and incorrect error handling flow.
Reproduction Script
require 'bundler/inline'
gemfile(true) do
source 'https://rubygems.org'
gem 'rails', '~> 7.0.0'
gem 'sqlite3', '~> 1.6.0'
gem 'activeadmin', '~> 2.9'
gem 'active_admin_import', '5.1.0'
gem 'rchardet', '~> 1.8'
gem 'rubyzip', '~> 2.3'
gem 'activerecord-import', '~> 1.5'
end
require 'active_record'
require 'action_controller/railtie'
require 'active_admin'
require 'active_admin_import'
require 'tempfile'
ActiveRecord::Base.establish_connection(adapter: 'sqlite3', database: ':memory:')
ActiveRecord::Base.logger = Logger.new(STDOUT)
ActiveRecord::Schema.define do
create_table :authors, force: true do |t|
t.string :name
t.string :last_name
t.timestamps
end
end
class Author < ActiveRecord::Base
validates_presence_of :name
end
# Mock ActionDispatch for testing
module ActionDispatch
module Http
class UploadedFile
end
end
end
puts "\n=== Testing Validation Order Issue ===\n"
# Create a test file with .txt extension (invalid format)
invalid_file = Tempfile.new(['test', '.txt'])
invalid_file.write("Name,Last name,Birthday\nJohn,Doe,1986-05-01")
invalid_file.rewind
# Mock uploaded file with text/plain content type
uploaded_file = Struct.new(:present?, :path, :tempfile, :content_type) do
def is_a?(klass)
klass == ActionDispatch::Http::UploadedFile
end
end.new(true, invalid_file.path, invalid_file, 'text/plain')
# Create model with auto encoding detection
model = ActiveAdminImport::Model.new(force_encoding: :auto, file: uploaded_file)
puts "Force encoding enabled: #{model.force_encoding?}"
puts "File format: #{uploaded_file.content_type}"
puts "Expected: Should reject .txt file BEFORE attempting encoding\n\n"
# The bug: encode_file callback runs BEFORE format validation
# In the old code:
# before_validation :encode_file
# validate :correct_content_type
#
# This means encode_file processes the .txt file before we check if it's valid CSV!
# Track if encode_file was called
encode_called = false
model.define_singleton_method(:encode_file) do
encode_called = true
puts "❌ BUG: encode_file was called BEFORE validation!"
super()
end
model.valid?
puts "\nModel valid: #{model.valid?}"
puts "Errors: #{model.errors.full_messages}"
puts "encode_file was called: #{encode_called}"
invalid_file.close
invalid_file.unlink
puts "\n✅ EXPECTED: Validation should reject the file BEFORE encode_file is called"Steps to Reproduce (in real application)
- Configure import with auto encoding detection:
ActiveAdmin.register Author do active_admin_import force_encoding: :auto end
- Visit
/admin/authors/import - Upload a
.txtfile (or any non-CSV format like.doc,.pdf, etc.) - Observe that the system attempts to encode the file before validation
Expected Behavior
The system should reject the file with error: "You can import only valid csv file" BEFORE attempting any encoding operations.
The validation lifecycle should be:
- ✅ Validate file format (
correct_content_type) - ✅ If valid, then encode file (
encode_file)
Actual Behavior
The encode_file method runs BEFORE format validation completes, causing:
- Unnecessary processing overhead: Invalid files are processed unnecessarily
- Potential encoding errors: Non-CSV files may cause encoding exceptions
- Incorrect error handling flow: Encoding happens before we know the file is valid
- Performance degradation: Large invalid files are read and processed
The incorrect lifecycle with the bug:
- ❌ Encode file (
encode_fileruns inbefore_validation) - ❌ Validate file format (
correct_content_type)
Root Cause
In model.rb:39, the callback order was incorrect:
before_validation :encode_file, if: ->(me) { me.force_encoding? && me.file.present? }
validate :correct_content_typeThe encode_file callback is registered as a before_validation hook, which means it runs before any validations, including the correct_content_type validation that checks file format.
Environment
- Ruby: 3.2.1
- Rails: 7.0.0+ (also affects 5.2, 6.0, 6.1, 7.1)
- ActiveAdmin: 2.9.0
- active_admin_import: 5.1.0 (all versions prior to the fix are affected)
- Database: SQLite3 1.6.0 (but affects all databases)