
How To Use A Python Library That Is Constantly Changing In A Docker Image Or New Container?

I organize my code in a Python package (usually in a virtual environment like virtualenv and/or conda) and then usually call python setup.py develop so ...

Solution 1:

During development it is IMO perfectly fine to map/mount the host directory with your ever-changing sources into the Docker container. The rest (the Python version, the other libraries you depend upon) you can all install in the normal way in the Docker container.
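For example, during development you might start the container along these lines (a minimal sketch; the image name and paths are placeholders for your own):

docker run -it --rm \
    -v "$(pwd)"/my_package:/src/my_package \
    my-dev-image:latest \
    bash

Inside the container, /src/my_package always reflects whatever you last saved on the host, so no rebuild is needed while you iterate.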

Once stabilized I remove the map/mount and add the package to the list of items to install with pip. I do have a separate container running devpi so I can pip-install packages whether I push them all the way to PyPI or just push them to my local devpi container.
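Once a package lives on such an index, installing it in the image is a single pip call (a sketch; the devpi host, user and index names are placeholders):

pip install --index-url http://devpi:3141/testuser/dev/+simple/ your_package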

Doing so speeds up container creation even if you use the common (but more limited) python [path_to_project/setup.py] develop. Your Dockerfile in this case should look like:

# the following seldom changes, only when a package is added to setup.py
COPY /some/older/version/of/project/plus/dependent/packages /older/setup
RUN pip install /older/setup/your_package.tar.gz

# the following changes all the time, but that is only a small amount of work
COPY /latest/version/of/project /project
RUN python /project/setup.py develop

If the first COPY results in changes to the files under /older/setup, the image gets rebuilt from that layer onwards.

Running python ... develop still takes extra time, and you need to rebuild/restart the container. Since my packages can all also just be copied in/linked to (in addition to being installed), that is still a large overhead. Instead, I run a small program in the container that checks whether the (mounted/mapped) sources have changed and then automatically reruns whatever I am developing/testing. That way I only have to save a new version and watch the output of the container.
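That watcher does not have to be anything elaborate; a crude polling loop along these lines is enough (a sketch only; the source path and the command being rerun are placeholders):

#!/bin/sh
# rerun the test/dev command whenever the mounted sources change
LAST=""
while true; do
    CURRENT=$(find /src/my_package -type f -name '*.py' -exec md5sum {} + | md5sum)
    if [ "$CURRENT" != "$LAST" ]; then
        LAST="$CURRENT"
        python /src/my_package/run_tests.py    # whatever you are developing/testing
    fi
    sleep 2
done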

Solution 2:

For deployment/distribution, it would be seamless to have a Docker image for your package. Without an image, you need to transfer your source code to the environment where it needs to run, configure a volume so the source is available inside the container, build it there, and so on. With an image, it is just a matter of pulling it and running a container from it.
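In other words, once the image is published, deployment collapses to something like this (the registry and image name are placeholders):

docker pull your-registry/ml_experiment:latest
docker run --rm your-registry/ml_experiment:latest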

But for ease and to get rid of manual steps in building the image, consider using docker-compose.

docker-compose.yml may look like this:

ml_experiment:
  build: <path/to/build/context>   # the directory that contains your Dockerfile
  volumes:
    - ~/data/:/data
  command: ["python", "run_ML_experiment_file.py"] 

Now, to build the image and bring up a container, you just need to run:

docker-compose up --build

The --build option is required to rebuild the image each time; otherwise docker-compose reuses the image it has already built.

Refer to https://docs.docker.com/compose/ for details.

Solution 3:

This may be reiterating some content from other good answers here, but here is my take. To clarify what I think your goals are, you want to 1) run the container without rebuilding it each time, and 2) have your most recent code be used when you launch the container.

To be blunt, achieving both (1) and (2) cannot be done without using a bind mount (-v host/dir:/docker/dir), ENV variables to switch between code versions, or building separate dev and production images. I.e., you cannot achieve both by using COPY alone, which only gets you (2).

  • Note that this is part of the philosophy of containers: they are meant to "freeze" your software exactly as it was when you built the image. The image itself is not meant to be dynamic (which is why containers are so great for reproducing results across environments!); to be dynamic, you must use bind mounts or other methods.

You can nearly achieve both goals if you do not mind doing a (quick) rebuild of your image each time; this is what Anthon's solution provides. These rebuilds will be fast if you structure your code changes appropriately and make sure not to modify anything that is built earlier in the Dockerfile. That way the preceding steps are not re-run each time you create a new image, since docker build reuses cached layers for steps that have not changed.

With that in mind, here is a way to use COPY and docker build -f ... to achieve (2) only.

  • Note that again, this will require rebuilding the image each time since COPY will copy a static snapshot of whatever directory you specify; updates to that directory after running docker build ... will not be reflected.

Assuming that you will build the image while in your code directory (not your home directory*), you could add something like this to the end of the Dockerfile:

COPY . /python_app
ENTRYPOINT python /python_app/setup.py develop

and then build the image via:

docker build -t your:tag -f path/to/Dockerfile .
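If you later also want goal (1), you can combine this image with the bind mount discussed above, so that the copy baked into /python_app is shadowed by your live source tree (a sketch; the host path is a placeholder):

docker run -it --rm -v "$(pwd)":/python_app your:tag

With the mount in place, the ENTRYPOINT runs setup.py develop against whatever is currently on the host, so the latest code is used without a rebuild.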

Note that this may be slower than Anthon's method, since each rebuild copies the entire code directory rather than just your most recent changes (which his approach avoids, provided you structure your code into static and development partitions).


*N.b. it is generally not advisable to COPY a large directory (e.g. your full home directory) since it can make the image very large (which may slow down your workflow when running the image on a cluster due to limited bandwidth or I/O!).

Regarding the apt-get update comment in your post: running update in the container ensures that later installs won't be using an old package index. So doing update is good practice since the source upstream image will generally have older package indexes meaning that an install may fail without a prior update. See also In Docker, why is it recommended to run `apt-get` update in the Dockerfile?.
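In practice the two are chained in a single RUN instruction, so the index refresh and the install always end up in the same layer; the shell part of such an instruction typically looks like this (the package names are just placeholders):

apt-get update && apt-get install -y --no-install-recommends libpq-dev gcc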

Solution 4:

I think you are looking for the bind mounts Docker feature. Check these docs: Use bind mounts. Using this, you can just mount the host directory with your constantly changing Python scripts and it will be available in the container. If you only need to mount a specific directory with the constantly changing scripts, I would additionally make use of pip install -r requirements.txt and combine all your packages into a single requirements.txt file (as I see you repeat RUN pip3 install ... in your Dockerfile).
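A minimal sketch combining the two ideas (the paths, image name and script name are placeholders): install everything listed in requirements.txt once while building the image, then bind-mount only the directory with the changing scripts when you run the container:

pip install -r requirements.txt    # run once, inside the image build
docker run --rm \
    --mount type=bind,source="$(pwd)"/scripts,target=/app/scripts \
    my-image:latest \
    python /app/scripts/train.py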

Solution 5:

I usually use this Dockerfile:

########### BUILDER ###########
# pull official base image
FROM python:3.8.3-slim as builder

# set work directory
WORKDIR /usr/src/app

# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

# install psycopg2 dependencies
RUN apt-get update \
    && apt-get -y install libpq-dev gcc \
    python3-dev musl-dev libffi-dev \
    && pip install psycopg2

# upgrade pip
RUN pip install --upgrade pip

# install dependencies
COPY ./requirements.txt .
RUN pip wheel --no-cache-dir --no-deps --wheel-dir /usr/src/app/wheels -r requirements.txt

# copy project
COPY . .

########### FINAL ###########
# pull official base image
FROM python:3.8.3-slim

# create directory for the app user
RUN mkdir -p /home/app

# create the app user
RUN addgroup --system app && adduser --system --group app

# create the appropriate directories
ENV HOME=/home/app
ENV APP_HOME=/home/app/web
RUN mkdir $APP_HOME
RUN mkdir $APP_HOME/static
RUN mkdir $APP_HOME/media
RUN mkdir $APP_HOME/currencies
WORKDIR $APP_HOME

# install dependencies
RUN apt-get update && apt-get install -y libpq-dev bash netcat rabbitmq-server
COPY --from=builder /usr/src/app/wheels /wheels
COPY --from=builder /usr/src/app/requirements.txt .
COPY wait-for /bin/wait-for
COPY /log /var/log
COPY /run /var/run

RUN pip install --no-cache /wheels/*

# copy project
COPY . $APP_HOME

# chown all the files to the app user
RUN chown -R app:app $APP_HOME
RUN chown -R app:app /var/log/
RUN chown -R app:app /var/run/

EXPOSE 8000

# change to the app user
USER app

# only for Django
CMD ["gunicorn", "Config.asgi:application", "--bind", "0.0.0.0:8000", "--workers", "3", "-k","uvicorn.workers.UvicornWorker","--log-file","-"]

docker-compose.yml

# docker-compose.yml
version: "3.7"

services:
  db:
    container_name: postgres
    hostname: postgres
    image: postgres:12
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    env_file:
      - .env.prod.db
    networks:
      - main
    restart: always

  pgbackups:
    container_name: pgbackups
    hostname: pgbackups
    image: prodrigestivill/postgres-backup-local
    restart: always
    user: postgres:postgres    # Optional: see below
    volumes:
      - ./backups:/backups
    links:
      - db
    depends_on:
      - db
    env_file: .env.prod.db
    networks:
      - main

  web:
    build: .
    container_name: web
    expose:
      - 8000
    command: sh -c "wait-for db:5432 && python manage.py makemigrations && python manage.py migrate && gunicorn Config.asgi:application --bind 0.0.0.0:8000 -w 3 -k uvicorn.workers.UvicornWorker --log-file -"
    volumes:
      - static_volume:/home/app/web/static
      - media_volume:/home/app/web/media
    env_file:
      - .env.prod
    hostname: web
    image: web-image
    networks:
      - main
    depends_on:
      - db
    restart: always

  prometheus:
    container_name: prometheus
    image: prom/prometheus
    hostname: prometheus
    volumes:
      - ./prometheus/:/etc/prometheus/
    ports:
      - 9090:9090
    networks:
      - main
    depends_on:
      - web
    restart: always

  grafana:
    container_name: grafana
    image: grafana/grafana:6.5.2
    hostname: grafana
    ports:
      - 3060:3000
    networks:
      - main
    depends_on:
      - prometheus
    restart: always

  nginx:
    container_name: nginx
    image: nginx:alpine
    hostname: nginx
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/conf.d/default.conf
      - ./wait-for:/bin/wait-for
      - static_volume:/home/app/web/static
      - media_volume:/home/app/web/media
    ports:
      - 80:80
    depends_on:
      - web
    networks:
      - main
    restart: always

networks:
  main:
    driver: bridge

volumes:
  static_volume:
  media_volume:
  postgres_data:

wait-for

#!/bin/sh

TIMEOUT=120
QUIET=0

echoerr() {
  if [ "$QUIET" -ne 1 ]; then printf "%s\n" "$*" 1>&2; fi
}

usage() {
  exitcode="$1"
  cat << USAGE >&2
Usage:
  $cmdname host:port [-t timeout] [-- command args]
  -q | --quiet                        Do not output any status messages
  -t TIMEOUT | --timeout=timeout      Timeout in seconds, zero for no timeout
  -- COMMAND ARGS                     Execute command with args after the test finishes
USAGE
  exit "$exitcode"
}

wait_for() {
  for i in `seq $TIMEOUT` ; do
    nc -z "$HOST" "$PORT" > /dev/null 2>&1

    result=$?
    if [ $result -eq 0 ] ; then
      if [ $# -gt 0 ] ; then
        exec "$@"
      fi
      exit 0
    fi
    sleep 1
  done
  echo "Operation timed out" >&2
  exit 1
}

while [ $# -gt 0 ]
do
  case "$1" in
    *:* )
    HOST=$(printf "%s\n" "$1" | cut -d : -f 1)
    PORT=$(printf "%s\n" "$1" | cut -d : -f 2)
    shift 1
    ;;
    -q | --quiet)
    QUIET=1
    shift 1
    ;;
    -t)
    TIMEOUT="$2"
    if [ "$TIMEOUT" = "" ]; then break; fi
    shift 2
    ;;
    --timeout=*)
    TIMEOUT="${1#*=}"
    shift 1
    ;;
    --)
    shift
    break
    ;;
    --help)
    usage 0
    ;;
    *)
    echoerr "Unknown argument: $1"
    usage 1
    ;;
  esac
done

if [ "$HOST" = "" -o "$PORT" = "" ]; then
  echoerr "Error: you need to provide a host and port to test."
  usage 2
fi

wait_for "$@"

nginx/nginx.conf

# nginx.conf

upstream back {
    server web:8000;
}
server {

    listen 80;

    location / {
        proxy_pass http://back;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_redirect off;
    }

    location /static/ {
        root /home/app/web/;
    }

    location /media/ {
        root /home/app/web/;
    }

}

prometheus/prometheus.yml

global:
  scrape_interval: 10s
  evaluation_interval: 10s
  external_labels:
    monitor: django-monitor

scrape_configs:
  - job_name: "main-django"
    metrics_path: /metrics
    tls_config:
      insecure_skip_verify: true
    static_configs:
      - targets:
          - host.docker.internal

  - job_name: 'prometheus'
    scrape_interval: 10s
    static_configs:
      - targets: [ 'host.docker.internal:9090' ]

.env.prod is unique to your project

.env.prod.db

POSTGRES_USER=
POSTGRES_PASSWORD=
POSTGRES_HOST=
POSTGRES_EXTRA_OPTS="-Z6 --schema=public --blobs"
SCHEDULE="@every 0h30m00s"
BACKUP_KEEP_DAYS=7
BACKUP_KEEP_WEEKS=4
BACKUP_KEEP_MONTHS=6
HEALTHCHECK_PORT=8080

Project run

docker build -t web-image .
docker-compose up

Project update

docker-compose up -d --build
docker-compose up

Run script

docker-compose exec web {script} 
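For example, to create a Django superuser by hand (just an illustration; substitute whatever management command you need):

docker-compose exec web python manage.py createsuperuser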

Swarm

docker swarm init --advertise-addr 127.0.0.1:2377
docker stack deploy -c docker-compose.yml  proj

Remove swarm

docker stack rm proj
docker swarm leave --force
