Genetic Algorithm Wireframing

I created a genetic algorithm capable of generating wireframes for unique web designs. It also uses color contrast to find an appropriate and readable text color.

Links to my github repository for this projectLinks to the live website for this project (if available)Links to the Medium article associated with this project (if available)

Python

Genetic Algorithm Wireframing

Introduction:

Genetic algorithms can be capable of producing unique and artistic outputs when applied to creative fields. In the following sections, I will outline an exploration of genetic algorithms and integer programming as applied to web design. I approach the addition of integer programming into the layout constraints of the websites generated by considering a traditional set packing problem. I plan to implement the genetic algorithm as a heuristic input to the objective function of the optimization problem. I hope that guiding the placement of different objects on the page through integer programming will produce more understandable and usable results. The fitness will also incorporate other design considerations such as color choice for the text and visual hierarchy. The goal of this project is to determine if and to what extent genetic algorithms can be used for creative tasks and how, through solving an optimization problem, they might find unique and accessible design patterns.

Background:

Genetic Algorithms

Genetic algorithms (GAs) are a type of search method consisting of a biologically based system in which a population of genomes is created and modified over time as the genetic structure changes. GAs implement randomized heuristics to allow for more unique and possibly more optimized genomes to be found. These heuristics typically take the form of mutation and crossover where a selected pair of the fittest genomes is used to create a new generation of the population. Thus, new genomes are created based on the fittest genomes of the previous generation. The following image shows a block diagram of this iterative process of a genetic algorithm.

Genetic Algorithm

The mutation changes one or more aspects of a given genome. So, for example, if the genome in question is a binary string, "110," the mutation process might flip one of the bits. We might then get "100." Crossover is a simple process of taking part of one genome and adding it to the corresponding part of another and vice versa. So, for example, we might have "111" and "010," leading to the crossover results of "011" and "110." The following image graphically depicts this process for reference.

Mutation and Crossover

Integer Programming & Optimization

Integer programming is a subset of optimization that limits variables to integer or binary types while subjecting the function to various constraints. The idea is to minimize some type of cost function while meeting certain conditions. One problem that integer programming is commonly used for is something known as set packing. Set packing essentially ensures that all areas are covered by some set, I. For example, consider a cell service network system. We want to minimize the cost of building cell towers in the system but ensure that everyone has access to cell service.

Another typical example in this field is called a "knapsack" problem in which, for example, a hiker must fit four differently weighted objects in a backpack, but can only carry a certain amount of weight. It is a question of cost versus utility in which we hope to maximize utility while meeting the cost constraints (which are weights in the hiker example). This is accomplished through a mathematical formulation of the problem by determining what form of constraints are necessary and how best to maximize the utility.

Web Design

Consider the image below that represents a fairly typical hero section layout; where a hero section is an attention-grabbing part of a landing page. This is the section responsible for bringing business to companies or hiring managers to applicants.

Standard Design Example

The blue rectangle is the navbar, the black rectangle is the text, the pink rectangle represents a button, and the yellow rectangle is the image. As you can see, the page is essentially broken up into columns with text on the left and an image on the right. Further, a traditional placement of the navigation bar is along the top of the page. The text contrasts well with the background and is thus readable and accessible. There are specific requirements for text contrast that fall into a larger collection of accessibility requirements constituted by the Web Content Accessibility Guidelines (WCAG). I will provide a more detailed mathematical discussion of this ruleset when I consider the optimization constraints below, but it is essentially a ratio of the lightness, or luminance, of the text and background colors.

Literature Review:

There are many facets of the creative design of websites using genetic algorithms. Numerous authors have approached the problem from different viewpoints and priorities, specifically about the fitness function design. Oliver, et al, [1] suggest a kind of interactive evaluation of the produced websites in which the user can give their own opinion on the design. This feedback is incorporated into the fitness function of the genetic algorithm, allowing the websites to become more tailored to the user.

In support of this algorithm, Oliver, et. al., [1] also considered two different facets of the design process: style optimization and layout optimization. Each component incorporated its own genetic algorithm to further the flexibility of its work. The authors thought that this would allow the users to choose if they wanted to work with just one form of optimization or both. In terms of style enhancement, a genetic code was created including information about the text and background colors for each object, font options such as font face, boldness, and size, and the text alignment. The authors approached contrast through the calculation of luminance, or essentially the lightness of each color, and comparing it for each color.

Their approach to layout optimization consisted of a genome which was essentially an HTML table element. This allowed different text and image objects to be placed in different rows and columns on the page. Constraints could then be placed on the table ensuring it maintained a certain size and had to be filled. Further constraints were also included on the objects based on how many could be in a table cell and of what type. The authors found some interesting results through this method and determined that their users were benefiting from the interactive style of the fitness function. Oliver, Monmarché, and Venturini [1] suggested the addition of constraints relating two or more objects to each other. They explained that this could indicate an issue with using both an interactive evaluation and an internal fitness evaluation.

Mensch suggested a similar interactive approach to genetic algorithm-generated web designs in his article "Optimizing Website Design Through the Application of an Interactive Genetic Algorithm" [3], but included a kind of starting point for the iterative process. He used a news article site as an example input to the genetic algorithm that consisted of 12 news articles with images and titles. The page also had a top navigation bar. He used an HTML grid to indicate the placement of objects on his site. Like Oliver, et. al, [1] Mensch also considered both style and layout optimization, but combined them into a single genome rather than using two genetic algorithms.

This genome consisted of options for the width of each object (how many columns of the table an object uses), and the background color, box-shadow, font family, weight, size, and font color of both the navigation bar and the content. The content also included a style feature to determine if overlapping is allowed or not. Each component of the genome has only a specific set of values available. For example, the object width can only take up 3, 4, or 5 columns in the table. This method seemed to generate favorable design patterns through user input, but the authors did suggest incorporating other metrics into the fitness function such as conversion rate or the time users spend on a page.

Günthermann and Kingston [2] took a more mathematical approach to the evaluation of their generated designs. They considered elements such as symmetry, the Golden Ratio, and the presence of images. the golden Ratio, according to the authors, is a design tool used to enforce symmetry without having objects be exactly the same size [2]. Its mathematical formulation is fairly simple; it is a comparison of the ’mass’ (area) of each object as follows.

Dm = (m1 - Φ m2)2

This relationship allows the combination of weights on either side of the page to be within a certain ratio of each other. This creates a more balanced layout while keeping some element of symmetry in the design. The authors also used a predetermined starting point, similar to Mensch [3], that included two columns and a header. They planned to match the color scheme of the site to the logo being used in the header. They did this while evaluating color contrast in their fitness assessment, similar to the stylistic optimization technique posed by Oliver, et. al. [1].

My Approach:

I incorporated a combination of the techniques described above in my genetic algorithm formulation. I decided to take inspiration from Günthermann, Kingston, and Mensch in creating a kind of prebuilt layout from which the GA can build on. However, rather than using a more strict interpretation by setting the initial design exactly, I decided to create four components that I planned to include on the website and have those components be randomly sized and placed on the page. I wrote classes for an image, button, text, and navbar component in the hopes of discovering a unique and interesting hero section layout, much like the one shown in the background. Each element has a width and height that can range from 1px up to the size of the page. For simplicity, I decided to create each component as a simple rectangle using pygame. This is representative of a wireframe; a basic layout from which a more detailed web design can be created. On another note, this form of output might lead to a more collaborative creativity emerging with the use of this tool as a starting place for inspiration. The wireframe will have an appearance similar to the image below in which each component is assigned a specific color (except for the text). The color of the text box will vary based on how well it contrasts with a white background. It would probably be worth varying other colors in the design and checking the contrast levels with all overlapping elements in the future, but that was outside of the scope of this project.

Example Output of Genetic Algorithm

Genome Creation

In terms of the setup for this project, a genome represents a single website consisting of the four elements discussed above. Each genome includes a kind of linked list with the page elements and their attributes (e.g. width, height, and location). This linked list will be used for mutation and crossover where one or more page element is modified by randomizing their values (mutation) or by substituting that element for the corresponding element in another site (crossover). The constitution of these lists is described in more detail below.

Fitness Function

Using the setup above as the basis for the genetic algorithm, an optimization problem can be formulated to make up the fitness function. Consider the set of website components below.

Set C = {navigation, button, image, text}

Each component, Ci in the set corresponds to another set of attributes including the x, and y position and the height and width of the rectangle.

Set A = {height, width, x, y}

Thus, let the variable, Xi,j represent the integer value of attribute j for component i. Now each of the constraints can be evaluated as applicable to each component. First, consider the navbar. It is desired that the x-coordinate of the navbar be towards the left side of the screen, which has a maximum width given by W. So, the first constraint should be:

X0,2 ≤ 0.05W

I also wanted the y coordinate of the navigation to be close to the top of the page, so I added the following constraint (where H is the page height).

X0,3 ≤ 0.1H

Further, the width of the navbar should take up most of the page. Consider, then,

0.95W ≤ X0,1 ≤ W

Next, we want a reasonable maximum value for the height also proportionate to the page height, H. Thus, we can add the constraint:

X0,0 ≤ 0.1H

Finally, to maintain a horizontal rectangle, I added a constraint in which the width must be greater than the height of the navbar.

X0,1 > X0,0

Next, consider the text component. There are three main constraints necessary for the reasonable placement of the text. First, the x location should be towards the left side of the page. So,

X3,2 ≤ 0.02W

Additionally, the width should be at least half the width of the page for readability. Thus, the following constraint is added.

X3,1 ≤ 0.5W

Finally, the height of the text should be at least about a third of the height of the page.

X3,0 ≥ 0.3H

Now, consider the image component. The width of the image should be at least 40% of the size of the window, giving:

X2,1 ≥ 0.4W

Further, the x coordinate must be within some range of the right side of the page. So,

X2,2 ≥ 0.5W

Similarly, the height of the image should be at least a third of the height of the page.

X2,0 ≥ 0.3H

Finally, there are two constraints for the button’s width and height. First, I wanted the width to be between:

0.05W ≤ X1,1 ≤ 0.5W

Further, the height of the button should be less than 10% of the page height.

X1,0 ≤ 0.1H

I also decided to include a few constraints that relate two or more components of each website together, respectively. The first concerns the overlapping of the image and text. Essentially, the difference between the x-coordinates should be greater than 0 (no overlap). I think that it would be interesting to experiment with a Euclidean distance in the future to see if more accurate results could be achieved. Currently, I just added the following constraint.

X2,2 − X3,2 ≥ 0

Next, I wanted the button to be under the text, so their y coordinates should be related by the following formula.

0.05W ≤ X3,3 − X1,3 ≤ 0.1W

Further, the navbar should be clearly visible. So, I implemented the following constraint to prevent the other components from overlapping it:

|Xi,2 + 0.5Xi,1| - |X0,2 + 0.5X0,1| ≤ 0.5(Xi,1 + X0,1)

for all i in the set I of components not equal to 0 (the navbar).

I decided to implement the Golden Ratio formula between the image size and the text size, similar to what Günthermann and Kingston described. The Golden Ratio explains that the two elements should have a 1:1.68 ratio (1 : ϕ in Günthermann and Kingston’s case). So, consider the following constraint in which there is some tolerance proportional to the page width.

1.68X3,1X3,0 − 0.02W ≤ X2,1X2,0 ≤ 1.68X3,1X3,0 + 0.02W

Finally, I applied a constraint relative to the amount of white space on the page in the hopes that the elements would fill more of the area. I did this in terms of the uncovered area so that the fitness would decrease proportionally to the white space. This idea is similar to set packing in that I want the components to cover the whole area. The following constraint represents this addition to the fitness function:

W H − area ≤ 0

where the area is calculated in the code based on each component’s location and size taking into account any overlapping. Note that the difference in the area can be less than 0 here because the size of the page elements is already constrained, so the total area can’t exceed the page size.

Therefore, the following integer optimization model can be developed using information from the genetic algorithm to calculate the fitness of each generation (which we essentially hope to maximize based on the constraints). Each constraint will either penalize or reward the objective function depending on if the constraint is met.

max fitness
s.t.: X0,3 ≤ 0.1H (1)
X0,2 ≤ 0.05W (2)
0.95W ≤ X0,1 ≤ W (3)
X0,0 ≤ 0.1H (4)
X0,1 ≥ X0,0 (5)
0.05W ≤ X1,1 ≤ 0.5W (6)
0.01H ≤ X1,0 ≤ 0.1H (7)
X2,1 ≥ 0.4W (8)
X2,2 ≥ 0.5W (9)
X2,0 ≥ 0.3H (10)
X3,2 ≤ 0.02W (11)
X3,1 ≤ 0.5W (12)
X3,0 ≥ 0.3H (13)
X2,2 − X3,2 ≥ 0 (14)
0.05W ≤ X3,3 − X1,3 ≤ 0.1W (15)
|Xi,2 + 0.5Xi,1| - |X0,2 + 0.5X0,1| ≤ 0.5(Xi,1 + X0,1) (16)
1.68X3,1X3,0 − 0.02W ≤ X2,1X2,0 ≤ 1.68X3,1X3,0 + 0.02W (17)
W H − area ≤ 0 (18)
Xi,j ≥ 0 integer ∀i ∈ c, ∀j ∈ A (19)

Experimentation:

The following section contains various experiments with different population sizes, number of generations, and mutation and crossover rates to find an optimal arrangement of parameters. To start, consider an internet with 12 websites. With 300 generations, the following graph can be created depicting the average fitness over the generations. Each data point is the average of the 10 best fitness values. The fitness seems to converge rather quickly. This could be good if the point of convergence is optimal.

Best Fitness Over 300 Generations

To see if this convergence was truly optimal, I next decided to try running the genetic algorithm with 3000 generations. The following graph shows the output of this trial in a similar manner as above. The best fitnesses are averaged every 100 generations in this case. The best fitness did reach a better optimum at about 27, compared with about 20 on the previous run, suggesting that the convergence earlier was to a suboptimal point.

Best Fitness Over 3000 Generations

I also wanted to vary the size of the initial population to see if a higher fitness could be achieved. Setting the number of generations back to 300, I first tried a population size of 36 websites. The following image shows the outcome of the average best fitness. The results of this experiment were interesting. The fitness was more varied over the 300 generations, but did reach a higher value than the previous population size of 12. However, it didn’t really seem to converge in the amount of time given.

Population Size of 36 Websites

I decided to try a larger population size of 60 websites as well. This showed similar variance to the population of 36 websites, but the fitness scores were lower overall as the following graph indicates.

Population Size of 60 Websites

Setting the population size back to 12 websites, I changed the mutation rate to 0.1 and the crossover rate to 0.9. This did show convergence again at a slightly higher value than the original.

90% Crossover & 10% Mutation Rates

Next, I wanted to play with crossover and mutation a little more. I set the crossover value to 0.8 and the mutation to 0.01 to see what effect this might have on the fitness. As the following graph indicates, the average fitness increased in an almost stepwise manner, but remained lower than the original overall. This seems to suggest that the model was attempting to converge on suboptimal points at different stages in the process.

80% Crossover & 1% Mutation Rates

Returning to a more balanced ratio of crossover and mutation, I next decided to try a 75% to 25% mix of the two, respectively. This showed a lot more variance and didn’t really seem to converge on any particular fitness value. Further, the average best fitnesses were significantly reduced as compared to the original or to the case with a 0.9 crossover rate and 0.1 mutation rate. Thus, it seemed like a mutation rate of 0.25 was too high given the scenario.

75% Crossover & 25% Mutation

With the previous results in mind, I concluded that a crossover rate of 90% and a mutation rate of 10% seemed to be the best for this problem. It also seems like the number of generations should be around 1000. To test the population size further, I decided to use these parameters with a population size of 36 to see if there was any improvement. The following plot indicates the results of this. Of the results so far, it seems like this option performed the best. The highest average fitness found was 27, which the model converged on fairly quickly. It seems like this would be a reasonable balance between performance and complexity.

1000 Generations, 36 Websites, 90% Crossover, & 10% Mutation

Finally, I also wanted to test the population size of 60 websites with the same parameters found above.

1000 Generations, 60 Websites, 90% Crossover, & 10% Mutation

As you can see, this did not outperform the version with 36 websites in the population. Therefore, due to the decrease in run time and increase in efficiency, I will move forward with the previous model with a population size of 36, 1000 generations, a mutation rate of 10%, and a crossover rate of 90%.

Results:

The following image shows one of the outputs of running the genetic algorithm with a population size of 36 websites, a mutation rate of 0.1, and a crossover rate of 0.9. While this isn’t exactly what I was expecting in terms of the design, it does seem like something that could feasibly be used as a real website.

Best Output

The pink rectangle is the button, the yellow rectangle represents an image, the blue rectangle is the navbar, and the maroon rectangle corresponds to text. The button is larger than usual, but the text and image are reasonable. They overlap in a unique way, which I think could lead to a more interesting layout. The navbar is too small to be a traditional desktop navigation, but would work as a hamburger menu that, when clicked, opens a full screen menu. I think that this could be used as a good starting point to build on. I created a quick representation of these rectangles as an actual web design using Figma (a design software). While I would still change the layout further, I think that this basic design shows promise. I used the same color generated for the text in the result below. I checked the contrast of the whole design using a plugin and found that all of the text was AAA rated (it contrasted very well). So, I think that the fitness of 25 was fairly indicative of the results. This image was found after 600 generations of the GA.

Example of Web Design from GA Wireframe

At the 800th generation of this run, the genetic algorithm created the following design. Similar to the one above, this seems like it could feasibly be interpreted as a usable website. There are similarities with the firstexample, suggesting the presence of mutation. The image and navbar seem to be in the same location, butthe text and button mutated. The previous design did receive a slightly higher fitness value at 25, whereas this design was given a fitness of 24. However, I personally would prefer the latter design. This suggests the importance of potentially incorporating an interactive evaluation as many of the authors in the literature review did.

Notable Output

Like the previous version, I also created a more flushed out web design in Figma based on the generated wireframe. This is shown below. I think it turned out well, but there are definitely some interesting design decisions. Particularly, the button is very long and skinny. I wasn’t sure how to interpret it at first. I thought about using vertical text, but it took too much attention away from the header and was difficult to read. I decided to use an arrow telling the user to continue scrolling.

Another Example Design from GA Wireframe

Concluding Thoughts & Future Work:

The aim of this project was to determine the feasibility of using integer programming and genetic algorithms to create artistic and unique web designs. After experimenting with different fitness evaluations and crossover/mutation rates, I would argue that this type of creative task can be aided by the use of a genetic algorithm. Of course, my implementation only used very basic wireframing techniques, but I think that this could have merit in the partnership between artificial intelligence and the designer. I do think that incorporating some kind of interactive evaluation, like Oliver, et. al., and Mensch suggested, would aid in the further development of this algorithm. I think that this could help designers connect with their users by gaining a better understanding of the types of layouts they prefer. I think that using a similar mathematical fitness formulation to generate an initial population from which the user can choose their favorite website would be a unique way to combine the two techniques. The user’s evaluation could then be added to the existing fitness before selecting which websites to perform crossover and mutation on. Another possibility that I think would be very interesting to explore is the application of deep learning in modeling user feedback. This could be used similarly to an interactive GA, but would be useful in situations where surveying many users is infeasible.

Refernces:

[1] A. Oliver, A. Monmarché, G. V. Interactive design of web sites with a genetic algorithm. IADIS International Conference (2002).

[2] Lukas Günthermann, J. K. Designing a website using a genetic algorithm. Artificial Intelligence XXXV (2018), 389–402.

[3] Mensch, E. P. Optimizing website design through the application of an interactive genetic algorithm. Senior Projects Spring 2016 (2016