Best practices for creating custom models

Last updated on Oct 1, 2025

Learn about the best practices to effectively prepare and create custom models.

Pick a strong use case

Lifestyle photography

Good:
  • Clear, in-focus people
  • Natural lighting and authentic expressions
  • Variety of poses and compositions
  • Simple or softly blurred backgrounds

Avoid:
  • Blurry or pixelated images
  • Harsh filters or extreme color grading
  • Overcrowded scenes or distracting backgrounds
  • Group shots where faces are too small to see clearly

Photoshoot of a person

Good:
  • Sharp, well-lit close-ups and mid-distance shots
  • Variety of poses, expressions, and outfits
  • Consistent lighting and environment
  • Clean or softly blurred backgrounds

Avoid:
  • Faces too small or partially obscured
  • Heavy shadows or harsh lighting
  • Too many similar shots
  • Blurry or low-quality images

Still life photography

Good:
  • Even, well-balanced lighting and shadows
  • Consistent style and color palette
  • Variety of compositions and arrangements
  • Sharp focus on primary subjects

Avoid:
  • Low-quality or blurry images
  • Product shots with logos or packaging
  • Distracting props or backgrounds
  • Unrelated or off-theme subjects

Illustrated character

Good:
  • Accurate anatomy and proportions
  • Consistent style and rendering quality
  • Variety of poses and expressions
  • Clear details without visual clutter

Avoid:
  • Low-quality or incomplete illustrations
  • Inconsistent styles or rendering
  • Limited variety in poses or perspectives
  • Distracting backgrounds or unrelated elements

Iconography

Good:
  • Clear, consistent icon style
  • Medium-to-high complexity designs
  • Consistent lighting and color palette
  • Clean, object-focused compositions

Avoid:
  • Low-quality or blurry icons
  • Unrelated or off-theme concepts
  • Inconsistent styles or rendering methods
  • Overly specific colors and design elements

Brand illustrations

Good:
  • Accurate anatomy and proportions
  • Consistent style and rendering quality
  • Variety of poses and expressions
  • Clear details without visual clutter

Avoid:
  • Low-quality or incomplete illustrations
  • Inconsistent styles or rendering
  • Limited variety in poses or perspectives
  • Distracting backgrounds or unrelated elements

3D graphics

Good:
  • Consistent perspective and proportions
  • Cohesive style, lighting, and rendering quality
  • Variety of compositions and angles
  • Clear, uncluttered designs

Avoid:
  • Low-quality or incomplete renders
  • Inconsistent styles or perspectives
  • Limited variety in angles or compositions
  • Distracting elements or unrelated objects

New brand expression illustrations

Good:
  • Strong, consistent brand style throughout
  • Clear compositions with room to breathe
  • Expressive, on-brand characters and scenes
  • Clean rendering with balanced lighting

Avoid:
  • Mixed styles or inconsistent perspectives
  • Crowded scenes with unclear focus
  • Off-brand props or unrelated visuals
  • Incomplete or low-quality illustrations

New concepts

Good:
  • Visually distinct and well-executed concepts
  • Consistent structure, lighting, and detail
  • Strong form and clear silhouette
  • High-quality images with clean rendering

Avoid:
  • Repeated shapes or minor variations
  • Distracting backgrounds or details
  • Incomplete or low-quality renders
  • Mixed rendering styles or effects

Use quality images

  • Use JPG or PNG files.
  • Choose 10 to 30 high-quality images that showcase the brand-specific styles and concept subjects you want to achieve.
  • Capture a varied set of images representing the style or subject.
  • Ensure that each image file size does not exceed 50 MB.
  • Ensure the images have a resolution higher than 1024x1024 pixels and an aspect ratio no wider than 16:9 for landscape or no taller than 9:16 for portrait.
  • Keep the aspect ratio of generated images consistent with the training dataset. If the training set is in portrait and you generate square images, the results can appear cut off.
  • Crop your sample images to focus on the most important visual elements. For example, exclude images that show a person or character off in the distance with a small face or body.
  • Include images displaying various viewpoints and backgrounds while maintaining a consistent aesthetic.
  • Make sure your images don’t share an unintended pattern, such as a white background in every image.
  • Remove distracting elements that you do not want the model to learn, such as a collage in the background of a portrait or a hat on a character.
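
The numeric limits in the list above can be checked before upload. The following is a minimal sketch, not an Adobe tool: the function names `check_image` and `check_dataset` are hypothetical, the image width, height, and file size are assumed to have been read already by whatever tool you use, "50 MB" is read as 50 MiB, and "higher than 1024x1024" is read as each side being at least 1024 pixels.

```python
from pathlib import PurePath

# Limits taken from the list above; the exact interpretations are assumptions.
MAX_BYTES = 50 * 1024 * 1024      # assumption: "50 MB" read as 50 MiB
MIN_SIDE = 1024                   # assumption: each side at least 1024 px
MIN_IMAGES = 10                   # lower end of the suggested 10-30 images
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def check_image(name, width, height, size_bytes):
    """Return a list of problems for one image (empty list means it passes)."""
    problems = []
    if PurePath(name).suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append("not a JPG or PNG file")
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 50 MB")
    if min(width, height) < MIN_SIDE:
        problems.append("a side is shorter than 1024 pixels")
    # Compare long:short against 16:9 with integers to avoid float rounding.
    long_side, short_side = max(width, height), min(width, height)
    if long_side * 9 > short_side * 16:
        problems.append("aspect ratio exceeds 16:9 or 9:16")
    return problems


def check_dataset(images):
    """images: list of (name, width, height, size_bytes) tuples.

    Returns a dict mapping each failing image name to its problems, plus a
    "(dataset)" entry when there are fewer images than suggested.
    """
    report = {}
    for name, width, height, size_bytes in images:
        problems = check_image(name, width, height, size_bytes)
        if problems:
            report[name] = problems
    if len(images) < MIN_IMAGES:
        report["(dataset)"] = [
            f"only {len(images)} images; aim for at least {MIN_IMAGES}"
        ]
    return report
```

A check like this covers only the measurable limits. Variety, consistent aesthetics, and distracting elements still need a visual review.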

Review Model Tags

  • Include permanent attributes of the subject or style you're training a model on, such as brown hair for a brunette character.
  • Do not include changeable attributes in Model Tags, such as the object a character is holding.
  • Include a minimum of three Model Tags.

Review Captions

  • Use captions to add detail and to train the custom model on the concepts you want it to generate.
  • Keep image captions concrete and descriptive, using the language you will use when prompting the model.
  • Vary sentence structure across all your image captions.
  • Modify auto captions as needed to inform the model of the details of the concept.
  • The Firefly base model does not recognize famous people or places, so captions should include descriptions of them to improve results.
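
One rough way to spot captions that do not vary in sentence structure is to group them by their opening words. This is only an illustrative heuristic under that assumption, not part of Firefly; the function name `repeated_openings` and the three-word window are hypothetical choices.

```python
from collections import defaultdict


def repeated_openings(captions, n_words=3):
    """Group captions by their first n_words (case-insensitive) and return
    only the groups that contain more than one caption."""
    groups = defaultdict(list)
    for caption in captions:
        opening = " ".join(caption.lower().split()[:n_words])
        groups[opening].append(caption)
    return {opening: items for opening, items in groups.items() if len(items) > 1}


captions = [
    "A photo of a red barn at dusk",
    "A photo of a dog on a beach",
    "Sunlit kitchen with copper pans on the wall",
]
# The first two captions share the opening "a photo of" and are flagged.
flagged = repeated_openings(captions)
```

Captions flagged this way are candidates for rewording; shared openings are not always a problem, so treat the result as a prompt for review.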

Use crisp prompts that align with your training data

  • In your prompts, use words and phrases similar to those you used in captions.
  • Prompt with concepts that align with what you trained the custom model on. This preserves the model's identity better than prompting with unrelated concepts (for example, asking for a black-and-white illustrated rocketship from a model trained on colorful lifestyle photography).
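
To get a rough sense of how closely a prompt matches the training captions, you can compare their vocabularies. The sketch below is an assumption-laden illustration, not how Firefly evaluates prompts: `prompt_overlap`, `content_words`, and the small stopword list are all hypothetical, and simple word matching ignores synonyms.

```python
import re

# Small, hypothetical stopword list; extend it to suit your captions.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "with", "and", "at", "to", "for"}


def content_words(text):
    """Lowercase the text and return its set of words, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9']+", text.lower()) if w not in STOPWORDS}


def prompt_overlap(prompt, captions):
    """Return (share of prompt words found in the captions, unseen words)."""
    vocabulary = set()
    for caption in captions:
        vocabulary |= content_words(caption)
    prompt_words = content_words(prompt)
    if not prompt_words:
        return 0.0, set()
    shared = prompt_words & vocabulary
    return len(shared) / len(prompt_words), prompt_words - vocabulary


captions = [
    "a woman laughing in a sunlit kitchen",
    "a man reading on a park bench",
]
aligned = prompt_overlap("a woman reading in a sunlit park", captions)
unrelated = prompt_overlap("black-and-white illustrated rocketship", captions)
```

Here the first prompt reuses only caption vocabulary, while every content word of the second is new to the model's captions, which mirrors the mismatch described in the list above.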

Use advanced style capabilities to further refine your image

  • The Visual intensity slider is set to its lowest value by default for optimal identity preservation. For creative use cases such as Style reference, increasing the visual intensity can produce more vibrant results.
  • When using Composition references for subjects, opt for images with white backgrounds or sketches depicting the subject in the desired pose.