Task Entities for Clean Architecture and ML
Tasks are the core of our project, and we are going to define that core in our first issue. To do it, we have to keep the following things in mind:
- We are implementing Clean Architecture, so our tasks are going to be our business entities and should be independent of any implementation.
- To be able to use Machine Learning with our tasks, we need them to be parameterized, and the properties that we use should be the ones that we consider relevant for the project.
I'm going to expand on the Clean Architecture point first.
In traditional software development, our first step would be to define the models for our tasks. That is a reasonable first step, but it has a fatal flaw: the models are tied to a specific database system.
If we want to keep our architecture clean, we shouldn't define models but entities that are independent of any database. Lucky for us, in Python, we have two good options to be able to do this: Dataclasses and Pydantic. Honestly, I like Pydantic better, but it comes with some extra features that we won't be using, so I'm going to go with Dataclasses for this one. Let's consider it an experiment, and we will see later if it was the right decision.
And what are we going to do with the database? That will come later, in a Storage layer. It breaks a lot of conventions, but we are trying to check if this type of development is worth it in the long run.
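To make the separation a bit more concrete, here is a minimal sketch of how an entity can stay unaware of persistence. Everything in it is an assumption for illustration: `TaskRecord`, `TaskStorage` and `InMemoryTaskStorage` are hypothetical names, not code from the project, and the real Storage layer may end up looking quite different.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Protocol
from uuid import UUID, uuid4


@dataclass
class TaskRecord:
    """Stand-in entity for this sketch only; it knows nothing about storage."""

    title: str
    uuid: UUID = field(default_factory=uuid4)


class TaskStorage(Protocol):
    """Hypothetical interface that a Storage layer could implement."""

    def save(self, task: TaskRecord) -> None: ...

    def get(self, uuid: UUID) -> Optional[TaskRecord]: ...

    def all(self) -> List[TaskRecord]: ...


class InMemoryTaskStorage:
    """Dictionary-backed implementation, handy for unit tests."""

    def __init__(self) -> None:
        self._tasks: Dict[UUID, TaskRecord] = {}

    def save(self, task: TaskRecord) -> None:
        self._tasks[task.uuid] = task

    def get(self, uuid: UUID) -> Optional[TaskRecord]:
        return self._tasks.get(uuid)

    def all(self) -> List[TaskRecord]:
        return list(self._tasks.values())


# The calling code only depends on the TaskStorage interface
storage: TaskStorage = InMemoryTaskStorage()
task = TaskRecord(title="Write the next post")
storage.save(task)
```

The point of the sketch is that a database-backed class could replace `InMemoryTaskStorage` later without touching the entity.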
Then we go to our second point: Machine Learning parameters.
This is very important. We already decided that we are going to be using AI to manage our tasks and make suggestions, so we need to prepare for it from the beginning.
Machine Learning works with data. If we don't have data, we won't be able to make any predictions, and if we don't have good data, our suggestions are going to be useless. To have good data, we need to know more about the tasks that we are handling.
Right now, our objective is to be able to suggest a task schedule for the day. To make this suggestion, we need to consider things like:
- When is the task due? Is it something that should be done today, or do we have some flexibility with the time?
- How much time will it take to do this task? Is it something that can be done in five minutes? Or does it take the whole day?
- How hard is it for me to handle this task? Does it make me happy to work on it? Do I stress out just thinking about it?
- How important is this task? Is it something that must be done? Or is it more of an optional thing?
Some of these parameters are hard to figure out, and it is probably going to be a hassle to ask for this information all the time. That is where it would be possible to add a second layer of predictions in the future, trying to guess the right parameters just from the task description, but that should be for another day.
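As a rough illustration of what "parameterized" could mean here, the four questions above can be turned into a list of numbers. This is only a sketch under my own assumptions: the `task_features` function, the fallback of one hour for unknown effort, and the numeric mood scale are all hypothetical and not part of the project.

```python
from datetime import date
from typing import List, Optional


def task_features(
    today: date,
    due_date: Optional[date],
    estimated_effort_hours: Optional[float],
    must_be_done: bool,
    mood_score: int,
) -> List[float]:
    """Turn the four scheduling questions into numbers a model could consume."""
    # Separate flag so "no due date" is not confused with a real day count
    has_due_date = 1.0 if due_date is not None else 0.0
    days_left = float((due_date - today).days) if due_date is not None else 0.0
    # Unknown effort falls back to 1 hour, an assumption made for this sketch
    effort = estimated_effort_hours if estimated_effort_hours is not None else 1.0
    importance = 1.0 if must_be_done else 0.0
    return [has_due_date, days_left, effort, importance, float(mood_score)]


features = task_features(
    today=date(2024, 1, 1),
    due_date=date(2024, 1, 4),
    estimated_effort_hours=0.5,
    must_be_done=True,
    mood_score=-1,  # hypothetical scale: -1 stressful, 0 neutral, 1 enjoyable
)
print(features)  # [1.0, 3.0, 0.5, 1.0, -1.0]
```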
With all of this in mind, we can work on the first version of our entities. It is important to note that we will be focusing only on the "task list" business domain, so it shouldn't be a big change.
Our PR itself turned out to be pretty short: https://github.com/andres-javier-lopez/taskmaster/pull/2. I ended up defining a single entity for it, the Task entity, but added more business logic to it than I initially expected.
from dataclasses import dataclass
from datetime import date, time
from typing import Optional

# Entity, TaskStatus and Mood are defined elsewhere in the project


@dataclass
class Task(Entity):
    """Represent a pending task on our list"""

    # required fields
    title: str

    # optional fields
    description: str = ""
    status: TaskStatus = TaskStatus.queued
    due_date: Optional[date] = None  # final date for this task to be done
    scheduled_date: Optional[date] = None  # scheduled date to work on task
    scheduled_time: Optional[time] = None  # optional time to work on task
    estimated_effort_hours: Optional[float] = None
    must_be_done: bool = True
    task_mood: Mood = Mood.neutral

    @property
    def is_open(self):
        return self.status in [
            TaskStatus.queued,
            TaskStatus.scheduled,
            TaskStatus.overdue,
        ]

    def schedule_for(self, date: date, time: Optional[time] = None):
        # Only queued or already scheduled tasks can be (re)scheduled
        if self.status in [TaskStatus.queued, TaskStatus.scheduled]:
            self.status = TaskStatus.scheduled
            self.scheduled_date = date
            self.scheduled_time = time

    def unschedule(self):
        if self.status == TaskStatus.scheduled:
            self.status = TaskStatus.queued
            self.scheduled_date = None
            self.scheduled_time = None

    def must_be_finished_before(self, date: date):
        self.due_date = date
        self.must_be_done = True

    def drop_it_after(self, date: date):
        self.due_date = date
        self.must_be_done = False

    def update_status_for_date(self, date: date):
        # A task without a due date can never become overdue or dropped
        if self.due_date is None:
            return
        if self.due_date <= date and self.status != TaskStatus.finished:
            if self.must_be_done:
                self.status = TaskStatus.overdue
            else:
                self.status = TaskStatus.dropped
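The snippet relies on `Entity`, `TaskStatus` and `Mood`, which are not shown in this post. A minimal sketch of what they might look like follows, assuming plain `Enum` classes and an empty base class. The `TaskStatus` members and `Mood.neutral` are the ones the Task code uses; the other `Mood` members and the string values are hypothetical, and the real definitions in the repository may differ.

```python
from dataclasses import dataclass
from enum import Enum


class TaskStatus(Enum):
    # Members referenced by the Task entity
    queued = "queued"
    scheduled = "scheduled"
    overdue = "overdue"
    finished = "finished"
    dropped = "dropped"


class Mood(Enum):
    neutral = "neutral"  # the only member the Task entity references
    enjoyable = "enjoyable"  # hypothetical, for illustration
    stressful = "stressful"  # hypothetical, for illustration


@dataclass
class Entity:
    """Marker base class; assumed here to declare no fields of its own."""


# Same membership check that Task.is_open performs
open_statuses = [TaskStatus.queued, TaskStatus.scheduled, TaskStatus.overdue]
print(TaskStatus.finished in open_statuses)  # False
```

The base class is kept empty on purpose in this sketch: with regular dataclass inheritance, a field with a default value in `Entity` would come before `title: str` in the generated `__init__` and raise an error, since fields without defaults cannot follow fields with defaults.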
Also, I kept getting sidetracked by setting up tools that I had initially forgotten about, like configuring unit tests and adding linters.
And what is this entity useful for? That will become more apparent in our next additions: the Task manager and Task storage. I will be writing about them next.