Top 5 Model Merge Methods Compared: SLERP, TIES, DARE & More
Choosing the right merge method can make or break your result. In this post we break down the five most widely used techniques and help you pick the right one for your use case.
1. Linear Interpolation (Weighted Average)
The simplest method: for each parameter, compute a weighted average across the source models. It's fast, predictable, and works well when merging just two closely related models. The downside? It tends to dilute specialized knowledge when you add more than two or three sources.
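To make the per-parameter averaging concrete, here is a minimal sketch in plain Python. Models are represented as flat lists of floats standing in for real parameter tensors; `linear_merge` and the example values are illustrative, not a real library API — an actual merge would iterate over every tensor in each model's state dict.

```python
def linear_merge(models, weights):
    """Per-parameter weighted average across source models (toy sketch)."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights should sum to 1"
    return [
        sum(w * params[i] for w, params in zip(weights, models))
        for i in range(len(models[0]))
    ]

# Two hypothetical "models", each with three parameters.
model_a = [0.2, -1.0, 0.5]
model_b = [0.6, 1.0, 0.1]

# 75/25 blend favoring model_a.
merged = linear_merge([model_a, model_b], [0.75, 0.25])
```

Note how each merged parameter is pulled toward the weighted mean: this is exactly the dilution effect described above once many sources contribute.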
2. SLERP (Spherical Linear Interpolation)
SLERP treats the weight tensors as points on a hypersphere and interpolates along the geodesic between them. This preserves the magnitude of the weights better than linear interpolation, which helps maintain model quality. It's limited to merging exactly two models at a time, but you can chain multiple SLERP merges.
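The geodesic interpolation can be sketched as follows. This is a toy implementation on flat float lists (real merges apply it tensor-by-tensor), and the linear fallback for nearly parallel vectors is a common numerical safeguard, not something the method itself mandates.

```python
import math

def slerp(a, b, t):
    """Spherical linear interpolation between weight vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Clamp to [-1, 1] to guard against floating-point drift in acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    theta = math.acos(cos_theta)
    if theta < 1e-6:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(theta)
    wa = math.sin((1 - t) * theta) / s
    wb = math.sin(t * theta) / s
    return [wa * x + wb * y for x, y in zip(a, b)]

# Midpoint between two orthogonal unit vectors stays on the unit circle,
# whereas a plain average would shrink its magnitude to ~0.707 per axis summed.
mid = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

Because `mid` keeps unit norm, the interpolated weights retain their original scale — the property the section above credits for preserving model quality.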
3. TIES-Merging
TIES (TrIm, Elect Sign & Merge) tackles the interference problem head-on. It first trims the smallest-magnitude changes, then resolves sign conflicts among the remaining deltas, and finally merges them. The result: each model's unique contributions come through more clearly. TIES is excellent when you're combining three or more models with different specializations.
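The three steps can be sketched in plain Python. This is a simplified per-vector illustration under assumed conventions (deltas measured against a shared base model, `density` controlling how much of each delta survives trimming); the real algorithm operates on full tensors and has more nuance than shown here.

```python
def ties_merge(base, finetuned_models, density=0.5):
    """Simplified TIES sketch: trim, elect sign, then merge deltas."""
    n = len(base)
    # Step 1 (TrIm): keep only the top-`density` fraction of each model's
    # deltas by magnitude, zeroing the rest.
    trimmed = []
    for model in finetuned_models:
        delta = [p - b for p, b in zip(model, base)]
        k = max(1, int(density * n))
        threshold = sorted((abs(d) for d in delta), reverse=True)[k - 1]
        trimmed.append([d if abs(d) >= threshold else 0.0 for d in delta])
    merged = list(base)
    for i in range(n):
        # Step 2 (Elect Sign): majority sign of the surviving deltas.
        total = sum(deltas[i] for deltas in trimmed)
        sign = 1.0 if total >= 0 else -1.0
        # Step 3 (Merge): average only the deltas agreeing with that sign.
        agreeing = [d[i] for d in trimmed if d[i] * sign > 0]
        if agreeing:
            merged[i] += sum(agreeing) / len(agreeing)
    return merged

# Two toy fine-tunes that agree on parameter 0 and conflict on parameter 1;
# trimming discards the small conflicting deltas before they can interfere.
result = ties_merge([0.0, 0.0, 0.0],
                    [[1.0, -0.2, 0.0], [1.0, 0.3, 0.0]],
                    density=0.5)
```

In this toy case the small, conflicting updates to parameter 1 are trimmed away, so the shared update to parameter 0 survives undiluted.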
4. DARE (Drop And REscale)
DARE randomly drops a fraction of each model's delta parameters before merging, then rescales the remaining ones. Think of it as dropout but for merge deltas. This creates "room" in the merged model for contributions from more sources. DARE has been shown to allow merging of many models simultaneously without the quality degradation you'd see with naive averaging.
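The drop-and-rescale step for a single model's delta can be sketched as below. Function name, drop rate, and seeding are illustrative assumptions; in a real multi-model merge you would apply this to each source's delta and then combine the sparsified deltas (for example by summing or by TIES-style merging).

```python
import random

def dare_delta(base, finetuned, drop_rate=0.9, seed=0):
    """DARE sketch: randomly drop deltas, rescale survivors by 1/(1 - p)."""
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - drop_rate)
    merged = []
    for b, p in zip(base, finetuned):
        delta = p - b
        if rng.random() < drop_rate:
            delta = 0.0           # dropped: leaves "room" for other models
        else:
            delta *= scale        # rescaled so the expected delta is unchanged
        merged.append(b + delta)
    return merged

# Toy model: 1000 parameters, each fine-tuned delta equal to 1.0.
base = [0.0] * 1000
tuned = [1.0] * 1000
merged = dare_delta(base, tuned, drop_rate=0.9, seed=0)
survivors = sum(1 for m in merged if m != 0.0)
```

With a 0.9 drop rate, roughly 10% of deltas survive at 10x strength, so the expected contribution per parameter is preserved while 90% of positions stay free for other models' deltas.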
5. Passthrough (Frankenmerging)
Instead of combining weights at each layer, passthrough stacking takes entire layers from different models and concatenates them. The result is a model that's physically larger than any parent. This is how the community has built 120B+ parameter models from 70B ingredients. It's unconventional and requires careful layer selection, but the results speak for themselves on the leaderboards.
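The stacking idea can be illustrated with a toy sketch where each "layer" is just a labeled placeholder; the overlapping ranges below are a hypothetical recipe, not a recommendation, and real frankenmerges pick ranges through careful experimentation.

```python
def passthrough_merge(slices):
    """Concatenate layer ranges from several parents into one deeper stack.

    Each slice is (model_layers, start, end); layers are copied verbatim,
    never averaged -- that is what distinguishes passthrough from the
    weight-combining methods above.
    """
    stacked = []
    for layers, start, end in slices:
        stacked.extend(layers[start:end])
    return stacked

# Two hypothetical 8-layer parents.
model_a = [f"A.layer{i}" for i in range(8)]
model_b = [f"B.layer{i}" for i in range(8)]

# Overlapping ranges deliberately duplicate mid-stack depth,
# yielding a 12-layer child from two 8-layer parents.
child = passthrough_merge([(model_a, 0, 6), (model_b, 2, 8)])
```

This is the same mechanism, scaled up, that turns 70B-parameter parents into 120B+ children: the child's depth is the sum of the selected ranges, not an average of the parents.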
Which Method Should You Use?
There's no universal answer, but here are some rules of thumb: start with SLERP for two-model merges, move to TIES or DARE when adding more models, and consider passthrough if you're chasing absolute performance and have the VRAM to serve a larger model. The best approach is often to experiment — and soon, MergeKit's recipe registry will make it easy to see which methods work best for specific model families.