AI method: In this paper, which follows Davoult et al. (2024), we formalise the use of planetary system architecture to train a Random Forest Classifier to predict whether an observed planetary system is likely to harbour an Earth-like planet.
Paper Abstract: In this paper, we aim at predicting which stars are most likely to host an Earth-like planet (ELP), in order to avoid blind searches, minimise detection times, and thus maximise the number of detections. We trained a Random Forest to recognise and classify systems as ‘hosting an ELP’ or ‘not hosting an ELP’. The Random Forest was trained and tested on populations of synthetic planetary systems derived from the Bern Model, and then applied to real observed systems.
The tests conducted on the machine learning model yield precision scores of up to 0.99, indicating that 99% of the systems identified by the model as having ELPs possess at least one. Among the few real observed systems that have been tested, eight have been selected as having a high probability of hosting an ELP, and a quick study of the stability of these systems confirms that the presence of an Earth-like planet within them would leave them stable.
The excellent results obtained from the tests conducted on the ML model demonstrate its ability to recognise the typical architectures of systems with or without ELPs within populations derived from the Bern Model. If we assume that the Bern Model adequately describes the architecture of real systems, then such a tool can prove indispensable in the search for Earth-like planets.
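To make the precision score quoted above concrete: precision is the fraction of systems flagged as 'hosting an ELP' that truly host one. The minimal Python sketch below computes it on made-up labels. The function name `precision` and the toy label lists are illustrative assumptions, not code or data from the paper.

```python
def precision(y_true, y_pred):
    """Fraction of predicted positives that are true positives: TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)
    if tp + fp == 0:
        raise ValueError("no system was classified as hosting an ELP")
    return tp / (tp + fp)


# Hypothetical labels: 1 = 'hosting an ELP', 0 = 'not hosting an ELP'.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]

# 4 true positives and 1 false positive in this toy example.
toy_precision = precision(y_true, y_pred)
```

Precision alone does not say how many ELP-hosting systems the classifier misses; that is measured by recall.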
List of planetary systems whose probability of harbouring an Earth-like planet is larger than 90%, according to our Random Forest Classifier. Adapted from Davoult et al. (2025), Astronomy and Astrophysics, in press, DOI: 10.1051/0004-6361/202452434
AI method: In this paper, we extend our classification scheme for exoplanetary systems. This is the basis of the upcoming paper Davoult et al. (2025).
Paper Abstract (abridged):
This study aims to identify the profile of a typical system that harbours an ELP by investigating the architecture of systems and the properties of their innermost detectable planets. Here, we introduce a novel method for determining the architecture of planetary systems and categorising them into four distinct classes. We then conduct a statistical study to identify the most favourable arrangements for the presence of an ELP.
Using three populations of synthetic planetary systems generated using the Bern model around three different types of stars, we studied the 'theoretical' architecture (the architecture of a complete planetary system) and the 'biased' architecture (the architecture of a system in which only detectable planets are taken into account after applying an observation bias) of the synthetic systems. To describe a typical system hosting an ELP, we initially examined the distribution of ELPs across different categories of architectures, highlighting the strong link between planetary system architecture and the presence of an ELP. A more detailed analysis was then conducted, linking the biased architecture of a system with the physical properties of its innermost observable planet to establish the most favourable conditions for the presence or absence of an ELP in a system.
We show in this paper that the detections of ELPs can be predicted thanks to the already known properties of their systems, and we present a list of the properties of the systems most likely to host such a planet.
Confusion matrix representing the class change of each population with observational bias for the synthetic planetary system population assuming a solar-type central star. The matrix is normalised on the biased architecture (rows). As an example, 85% of biased Anti-Ordered systems around G stars have an Anti-Ordered theoretical architecture. Adapted from Davoult et al. (2024), Astronomy & Astrophysics, Volume 689, id.A309.
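As a minimal sketch of what normalising such a matrix on the biased architecture means, the Python snippet below divides each row of a table of counts by its row total, so that each row gives the distribution of theoretical classes for one biased class. The counts are invented for illustration (only the 85% Anti-Ordered entry mirrors the example in the caption), and the helper name `normalise_rows` is an assumption.

```python
def normalise_rows(counts):
    """Divide each row by its sum so that every non-empty row adds up to 1."""
    normalised = []
    for row in counts:
        total = sum(row)
        normalised.append([c / total if total else 0.0 for c in row])
    return normalised


classes = ["Similar", "Mixed", "Anti-Ordered", "Ordered"]

# Hypothetical counts: rows = biased architecture, columns = theoretical architecture.
counts = [
    [80, 10, 5, 5],
    [20, 50, 20, 10],
    [5, 5, 85, 5],
    [10, 20, 10, 60],
]

matrix = normalise_rows(counts)

# Fraction of biased Anti-Ordered systems whose theoretical class is also Anti-Ordered.
anti_ordered_kept = matrix[2][2]
```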
AI method: This second paper explores the relative influence of initial conditions versus evolution in the architecture of planetary systems. It follows closely the framework developed in Paper I.
Paper Abstract: In the first paper of this series (see below), we proposed a model-independent framework for characterising the architecture of planetary systems at the system level. There are four classes of planetary system architecture: similar, mixed, anti-ordered, and ordered. In this paper, we investigate the formation pathways leading to these four architecture classes. To understand the role of nature versus nurture in sculpting the final (mass) architecture of a system, we apply our architecture framework to synthetic planetary systems - formed via core-accretion - using the Bern model. General patterns emerge in the formation pathways of the four architecture classes. Almost all planetary systems emerging from protoplanetary disks whose initial solid mass was less than one Jupiter mass are similar. Systems emerging from heavier disks may become mixed, anti-ordered, or ordered. Increasing dynamical interactions (planet-planet, planet-disk) tends to shift a system's architecture from mixed to anti-ordered to ordered. Our model predicts the existence of a new metallicity-architecture correlation. Similar systems have very high occurrence around low-metallicity stars. The occurrence of the anti-ordered and ordered classes increases with increasing metallicity. The occurrence of mixed architecture first increases and then decreases with increasing metallicity. In our synthetic planetary systems, the role of nature is disentangled from the role of nurture. Nature (or initial conditions) pre-determines whether the architecture of a system becomes similar; otherwise nurture influences whether a system becomes mixed, anti-ordered, or ordered. We propose the 'Aryabhata formation scenario' to explain some planetary systems which host only water-rich worlds. We finish this paper with a discussion of future observational and theoretical works that may support or refute the results of this paper.
Emergence of formation pathways: Sankey diagram depicting the emergence of formation pathways of architecture classes. The thickness of the links and nodes is proportional to the relative number of synthetic systems in our simulation. This result is derived from synthetic planetary systems around a solar-mass star via the Bern model. Disk gas mass and metallicity are binned at their median values. Adapted from Mishra et al. (2023), Astronomy & Astrophysics, Volume 670, id.A69, doi:10.1051/0004-6361/202244705
AI method: This paper is the foundation of future work considering multi-planetary systems as single entities (and not just collections of planets). As these entities live in high-dimensional spaces, they require novel ML/AI methods to be studied.
Paper Abstract: We present a novel, model-independent framework for studying the architecture of an exoplanetary system at the system level. This framework allows us to characterise, quantify, and classify the architecture of an individual planetary system. Our aim in this endeavour is to generate a systematic method to study the arrangement and distribution of various planetary quantities within a single planetary system. We propose that the space of planetary system architectures be partitioned into four classes: similar, mixed, anti-ordered, and ordered. We applied our framework to observed and synthetic multi-planetary systems, thereby studying their architectures of mass, radius, density, core mass, and the core water mass fraction. We explored the relationships between a system's (mass) architecture and other properties. Our work suggests that: (a) similar architectures are the most common outcome of planet formation; (b) internal structure and composition of planets show a strong link with their system architecture; (c) most systems inherit their mass architecture from their core mass architecture; (d) most planets that started inside the ice line and formed in-situ are found in systems with a similar architecture; and (e) most anti-ordered systems are expected to be rich in wet planets, while most observed mass ordered systems are expected to have many dry planets. We find, in good agreement with theory, that observations are generally biased towards the discovery of systems whose density architectures are similar, mixed, or anti-ordered. This study probes novel questions and new parameter spaces for understanding theory and observations. Future studies may utilise our framework to not only constrain the knowledge of individual planets, but also the multi-faceted architecture of an entire planetary system. We also speculate on the role of system architectures in hosting habitable worlds.
Classes of architecture. This schematic diagram shows the four architecture classes: similar, anti-ordered, mixed, and ordered. Depending on how a quantity (e.g. mass or size) varies from one planet to another, the architecture of a system can be identified. Adapted from Mishra et al. (2023), Astronomy & Astrophysics, Volume 670, id.A68, doi:10.1051/0004-6361/202243751
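One simple way to turn the idea in this caption into code is sketched below in Python: a system is labelled from how a quantity changes from one planet to the next (an average trend) and from how much it varies overall (a relative spread). The function name, the decision rule, and the two thresholds `trend_cut` and `spread_cut` are illustrative assumptions chosen for this sketch; they are not the classification boundaries defined in the paper.

```python
import math


def classify_architecture(values, trend_cut=0.2, spread_cut=0.5):
    """Label a system from a planetary quantity listed from inner to outer planet.

    The thresholds are hypothetical and only serve this illustration.
    """
    if len(values) < 2 or any(v <= 0 for v in values):
        raise ValueError("need at least two positive values")
    n = len(values)
    # Mean log10 ratio of adjacent planets: > 0 means the quantity grows outward.
    trend = sum(math.log10(b / a) for a, b in zip(values, values[1:])) / (n - 1)
    mean = sum(values) / n
    # Coefficient of variation: relative spread of the quantity within the system.
    spread = math.sqrt(sum((v - mean) ** 2 for v in values) / n) / mean
    if trend > trend_cut:
        return "ordered"
    if trend < -trend_cut:
        return "anti-ordered"
    if spread <= spread_cut:
        return "similar"
    return "mixed"


# Toy masses (arbitrary units), ordered from the innermost to the outermost planet.
labels = {
    "similar": classify_architecture([1.0, 1.1, 0.9, 1.0]),
    "ordered": classify_architecture([1.0, 10.0, 100.0]),
    "anti-ordered": classify_architecture([100.0, 10.0, 1.0]),
    "mixed": classify_architecture([1.0, 50.0, 1.0]),
}
```

The same function can be applied to any positive planetary quantity (mass, radius, density), which is what makes a system-level label of this kind convenient to compare across systems.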
Our contribution: In this paper, we define a new metric to compare properties of planetary systems. We then show, using the unsupervised visualisation technique T-SNE, that the architecture of planetary systems is related to the properties of the circumstellar disks in which they form.
Paper Abstract: Planet formation models now often consider the formation of planetary systems with more than one planet per system. This raises the question of how to represent planetary systems in a convenient way (e.g. for visualisation purpose) and how to define the similarity between two planetary systems, for example to compare models and observations.
We define a new metric to infer the similarity between two planetary systems, based on the properties of planets that belong to these systems. We then compare the similarity of planetary systems with the similarity of protoplanetary discs in which they form.
We first define a new metric based on a mixture of Gaussians, and then use this metric to apply a dimensionality reduction technique in order to represent planetary systems (which should be represented in a high-dimensional space) in a two-dimensional space. This allows us to study the structure of a population of planetary systems and its relation with the characteristics of the protoplanetary discs in which planetary systems form.
We show that the new metric can help to find the underlying structure of populations of planetary systems. In addition, the similarity between planetary systems, as defined in this paper, is correlated with the similarity between the protoplanetary discs in which these systems form. We finally compare the distribution of inter-system distances for a set of observed exoplanets with the distributions obtained from two models: a population synthesis model and a model where planetary systems are constructed by randomly picking synthetic planets. The observed distribution is shown to be closer to the one derived from the population synthesis model than from the random systems.
The new metric can be used in a variety of unsupervised machine learning techniques, such as dimensionality reduction and clustering, to understand the results of simulations and compare them with the properties of observed planetary systems.
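A minimal Python sketch of a distance of this kind is given below. It assumes that each planet is represented by an isotropic Gaussian of common width `sigma` centred on its coordinates in some feature space (for example log semi-major axis and log mass), that a system is the sum of its planets' Gaussians, and that the distance between two systems is the L2 norm of the difference between the two sums. These choices, the function names, and the toy systems are assumptions made for illustration; the exact definition used in the paper may differ.

```python
import math


def gaussian_overlap(p, q, sigma):
    """Integral of the product of two isotropic Gaussians centred on p and q."""
    d = len(p)
    sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
    norm = (4.0 * math.pi * sigma ** 2) ** (-d / 2.0)
    return norm * math.exp(-sq_dist / (4.0 * sigma ** 2))


def mixture_overlap(sys_a, sys_b, sigma):
    """Inner product of two systems, each a sum of one Gaussian per planet."""
    return sum(gaussian_overlap(p, q, sigma) for p in sys_a for q in sys_b)


def system_distance(sys_a, sys_b, sigma=0.3):
    """L2 distance between the Gaussian mixtures of two systems."""
    aa = mixture_overlap(sys_a, sys_a, sigma)
    bb = mixture_overlap(sys_b, sys_b, sigma)
    ab = mixture_overlap(sys_a, sys_b, sigma)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(aa + bb - 2.0 * ab, 0.0))


# Hypothetical systems: one (log10 semi-major axis, log10 mass) pair per planet.
systems = [
    [(-1.0, 0.0), (0.0, 1.0)],
    [(-1.0, 0.0), (0.5, 2.0)],
    [(0.7, 2.5)],
]

# Pairwise distance matrix over the toy population.
distances = [[system_distance(s, t) for t in systems] for s in systems]
```

Because the distance is defined between mixtures rather than between individual planets, systems with different numbers of planets can be compared directly. A pairwise matrix such as `distances` is the kind of input that dimensionality reduction or clustering methods working from precomputed distances can use.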
T-SNE representation based on the distance in the space of planetary systems (upper panel) and distance in the space of disc parameters (lower panel). The upper panel is similar to Fig. 8, upper panel, except that the colour code is here only linked to the position of the point on the plot. On the lower panel, we used T-SNE based on the similarity resulting from the metric in the space of disc parameters to represent systems. The colour code indicates in which part of the upper panel the same system is represented. In the lower panel, two points located close to each other represent planetary systems formed in similar discs, whereas two points with similar colours represent planetary systems that are themselves similar. Adapted from Alibert et al. (2019), doi:10.1051/0004-6361/201834592